ABSTRACT: We perform a forecast of the MSSM with universal soft terms (CMSSM) for the
LHC, based on an improved Bayesian analysis. We do not incorporate ad hoc measures of the
fine-tuning to penalize unnatural possibilities: such penalization arises from the Bayesian analysis
itself when the experimental value of $M_Z$ is considered. This allows us to scan the whole parameter
space, allowing arbitrarily large soft terms. Still, the low-energy region is statistically favoured (even
before including dark matter or $g-2$ constraints). Contrary to other studies, the results are almost
unaffected by changing the upper limits taken for the soft terms. The results are also remarkably
stable when using flat or logarithmic priors, a fact that arises from the larger statistical weight of
the low-energy region in both cases. We then incorporate all the important experimental constraints
into the analysis, obtaining a map of the probability density of the MSSM parameter space, i.e. the
forecast of the MSSM. Since not all the experimental information is equally robust, we perform
separate analyses depending on the group of observables used. When only the most robust ones
are used, the favoured region of the parameter space contains a significant portion outside the LHC
reach. This effect is reinforced if the Higgs mass is not close to its present experimental limit and
persists when dark matter constraints are included. Only when the $g-2$ constraint (based on $e^+e^-$
data) is considered is the preferred region (for $\mu > 0$) well inside the LHC scope. We also perform
a Bayesian comparison of the positive- and negative-$\mu$ possibilities.
1. Introduction



The idea of an LHC forecast for the Minimal Supersymmetric Standard Model (MSSM) is
to use all the present (theoretical and experimental) information available to determine the
relative probability of the different regions of the MSSM parameter space. This includes
theoretical constraints (and perhaps prejudices) and experimental constraints, such as electroweak
precision tests. For recent work on this subject see refs. [1-10].
An appropriate framework to perform such a forecast is the Bayesian approach (for a review
see ref. [?]), which allows a sensible statistical analysis and a neat separation of the
objective and subjective pieces of information.

The probability density of a particular point in the parameter space, say $\{\theta_i^0\}$, given a
certain set of data, is the so-called posterior probability density function (pdf), $p(\theta_i^0|\mathrm{data})$,
which is given by the fundamental Bayesian relation

$$ p(\theta_i^0|\mathrm{data}) = p(\mathrm{data}|\theta_i^0)\, p(\theta_i^0)\, \frac{1}{p(\mathrm{data})} . \qquad (1.1) $$

Here $p(\mathrm{data}|\theta_i^0)$ is the likelihood (sometimes denoted by $\mathcal{L}$), i.e. the probability density
of measuring the given data for the chosen point in the parameter space.¹ $p(\theta_i^0)$ is the
prior, i.e. the "theoretical" probability density that we assign a priori to the point in
the parameter space. Finally, $p(\mathrm{data})$ is a normalization factor which plays no role unless
one wishes to compare different classes of models, so for the moment it can be dropped
from the previous formula. One can say that in eq. (1.1) the first factor (the likelihood) is
objective, while the second (the prior) contains our prejudices about how the probability
is distributed a priori in the parameter space, given all our previous knowledge about the
model.

Ignoring the prior factor is not necessarily the most reasonable or "free of prejudices"
attitude. Such a procedure amounts to an implicit choice for the prior, namely a completely
flat prior in the parameters. However, choosing e.g. $\theta_i^2$ as initial parameters instead of $\theta_i$,
the previous flat prior becomes non-flat. So one needs some theoretical basis to establish,
at least, the parameters whose prior can be reasonably taken as flat.
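
The statement that flatness is not preserved under a change of parameters can be checked with a short sketch. It assumes a single hypothetical parameter $\theta$ with a flat prior on $[0, \theta_{\max}]$ and looks at the density induced on $\phi = \theta^2$; the function names are illustrative and not part of the analysis.

```python
import math
import random


def flat_density_in_theta(theta, theta_max):
    """Flat (uniform) prior density on 0 <= theta <= theta_max."""
    return 1.0 / theta_max if 0.0 <= theta <= theta_max else 0.0


def induced_density_in_phi(phi, theta_max):
    """Density of phi = theta**2 induced by the flat prior in theta.

    Change of variables: p(phi) = p(theta) * |d theta / d phi| = p(theta) / (2 * sqrt(phi)).
    """
    if phi <= 0.0:
        return 0.0
    theta = math.sqrt(phi)
    return flat_density_in_theta(theta, theta_max) / (2.0 * theta)


def sampled_fraction_below(phi_cut, theta_max, n=200_000, seed=1):
    """Monte Carlo cross-check: fraction of flat-in-theta samples with theta**2 < phi_cut."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if rng.uniform(0.0, theta_max) ** 2 < phi_cut)
    return hits / n


if __name__ == "__main__":
    for phi in (0.01, 0.25, 0.81):
        print(phi, induced_density_in_phi(phi, 1.0))
    print(sampled_fraction_below(0.25, 1.0))
```

The induced density grows like $1/\sqrt{\phi}$ at small $\phi$, so a prior that is flat in $\theta$ strongly favours small values once it is expressed in $\phi$.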

Besides, note that a choice for the allowed ranges of the various parameters is necessary
in order to make statistical statements. Often one is interested in showing the probability
density of one (or several) of the initial parameters, say $\theta_i$, $i = 1, ..., N_1$, but not in the
others, $\theta_i$, $i = N_1 + 1, ..., N$. Then, one has to marginalize the latter, i.e. integrate over them in the
parameter space:

$$ p(\theta_i, i = 1, ..., N_1|\mathrm{data}) = \int d\theta_{N_1+1} \cdots d\theta_N \; p(\theta_i, i = 1, ..., N|\mathrm{data}) . \qquad (1.2) $$

This procedure is very useful and common to make predictions about the values of particularly
interesting parameters. Now, in order to perform the marginalization, we need an
input for the prior functions and for the range of allowed values of the parameters, which
determines the range of the definite integration (1.2). A choice for these ingredients is
therefore inescapable in trying to make LHC forecasts.
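
A toy numerical marginalization shows why the integration range matters. The sketch below assumes a hypothetical two-parameter posterior in which the data constrain only the difference $\theta_1 - \theta_2$ and the prior is flat; none of the names or numbers come from the MSSM analysis.

```python
import math


def marginalize(posterior, theta1, theta2_min, theta2_max, steps=2000):
    """Midpoint-rule integral of posterior(theta1, theta2) over theta2 in [theta2_min, theta2_max]."""
    h = (theta2_max - theta2_min) / steps
    return h * sum(posterior(theta1, theta2_min + (k + 0.5) * h) for k in range(steps))


def toy_posterior(theta1, theta2):
    """Hypothetical unnormalized posterior: Gaussian likelihood in theta1 - theta2 times a flat prior."""
    likelihood = math.exp(-0.5 * (theta1 - theta2) ** 2)
    prior = 1.0
    return likelihood * prior


def relative_weight(theta2_max):
    """Marginal density at theta1 = 5 relative to theta1 = 0, for theta2 in [0, theta2_max]."""
    high = marginalize(toy_posterior, 5.0, 0.0, theta2_max)
    low = marginalize(toy_posterior, 0.0, 0.0, theta2_max)
    return high / low


if __name__ == "__main__":
    print(relative_weight(1.0))   # narrow range for the marginalized parameter
    print(relative_weight(10.0))  # wide range for the marginalized parameter
```

With the narrow range the point $\theta_1 = 5$ is strongly disfavoured, while with the wide range it becomes the preferred one, although the likelihood is the same in both cases.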

In the present paper we apply these concepts to the study of the MSSM [12]. More
precisely, we will consider a standard framework, often called CMSSM or MSUGRA, in
which the soft parameters are assumed universal at a high scale ($M_X$), where the supersymmetry
(SUSY) breaking is transmitted to the observable sector, as happens e.g. in the
gravity-mediated SUSY breaking scenario. Hence, our parameter space is defined by the
following parameters:

$$ \{\theta_i\} = \{m, M, A, B, \mu, s\} . \qquad (1.3) $$

Here $m$, $M$ and $A$ are the universal scalar mass, gaugino mass and trilinear scalar coupling;
$B$ is the bilinear scalar coupling; $\mu$ is the usual Higgs mass term in the superpotential; and
$s$ stands for the SM-like parameters of the MSSM. The latter include the $SU(3) \times SU(2) \times U(1)_Y$
gauge couplings, $g_3, g, g'$, and the Yukawa couplings, which in turn determine the
fermion masses and mixing angles.

¹ Frequentist approaches, which are an alternative to Bayesian ones, are based on the analysis of the
likelihood function in the parameter space; see ref. [10] for a recent frequentist analysis of the MSSM.

In sect. 2 we explain the set up for this study. In our opinion we have improved previous
analyses in several aspects. Namely, we have not made ad hoc assumptions to penalize
fine-tuned regions. A nice consequence is the absence of any dependence on the initial ranges
for the MSSM parameters. Besides, we have treated the nuisance variables (in particular the
Yukawa couplings) rigorously, and we have made a satisfactory choice of priors
for the initial parameters (actually two different choices, to evaluate the dependence on
the prior). In sect. 3 we compare the relative probability of the high- and low-energy (i.e.
accessible to LHC) regions of the MSSM parameter space. We show that, for any reasonable
prior, the low-energy region is statistically favoured only after properly incorporating the
information about the scale of electroweak breaking. Sections 4 and 5 are devoted to including
all the important experimental constraints into the analysis, for positive and negative
$\mu$-parameter respectively. In this way, we obtain a map of the probability density of the
MSSM parameter space, i.e. the MSSM forecast for the LHC. We distinguish between
the most robust experimental data (EW observables, limits on masses of supersymmetric
particles, etc.) and more controversial data ($g_\mu - 2$) or model-dependent constraints (Dark
Matter), performing separate analyses depending on the group of observables used. The
comparison between the positive- and negative-$\mu$ cases is done in sect. 5. In sect. 6 we
present a summary of the analysis and the main conclusions.



2. The set up for the scan of the MSSM 
2.1 Electroweak breaking 

The main motivation of low-energy SUSY is the nice implementation of the electroweak 
(EW) breaking, where the EW scale (or, equivalently, the Z mass) does not suffer from 
enormous (quadratic) radiative corrections. Actually, in the MSSM the EW breaking 
occurs naturally in a substantial part of the parameter space. This success is greatly due 
to the SUSY radiative contributions to the Higgs potential. Of course, in our analysis, the 
points of the parameter space that do not have a correct EW breaking are to be discarded, 
as usual. 

It is common lore that the parameters of the MSSM, $\{m, M, A, B, \mu\}$, should not be
far from the experimental EW scale in order to avoid unnatural fine-tunings to obtain
the correct size of the EW breaking. There is a rich literature [13, 14, 15, 16, 17] about
the best way to quantify this fine-tuning. It is normally understood that regions of the
parameter space with large fine-tuning (typically at large values of the soft parameters)
are to be considered unnatural and thus uninteresting. The exception to this rule is
landscape-like scenarios, which we do not consider in detail in this paper. Previous Bayesian
studies of the MSSM have attempted to incorporate this criterion by implementing some
penalization of the fine-tuned regions, e.g. using a conveniently modified prior for the
MSSM parameters [?, 16, ?]. Another (more usual) practice has been to restrict the range
of the soft parameters to $\lesssim$ few TeV, but this makes the results dependent on the actual
ranges considered.

However, since the naturalness arguments are deep down statistical arguments, one
might expect that an effective penalization of fine-tunings should arise from the Bayesian
analysis itself, with no need to introduce "naturalness priors" or to restrict the soft terms
to the low-energy scale. It was shown in ref. [18] that this is indeed the case (see also [17]
for a previous observation in this sense). Let us briefly recall the argument. The key point
is to consider $M_Z^{\mathrm{exp}}$ as experimental data on the same footing as the others, entering the total
likelihood, $\mathcal{L}$. For the sake of simplicity let us approximate the likelihood associated to the
$Z$ mass as a Dirac delta, so



$$ p(\mathrm{data}|s, m, M, A, B, \mu) \simeq \delta(M_Z - M_Z^{\mathrm{exp}})\, \mathcal{L}_{\mathrm{rest}} , \qquad (2.1) $$



where $\mathcal{L}_{\mathrm{rest}}$ is the likelihood associated to all the physical observables, except $M_Z$. Now,
we can take advantage of this Dirac delta to marginalize the pdf in one of the initial
parameters, e.g. $\mu$, performing a change of variable $\mu \to M_Z$:

$$ p(s, m, M, A, B|\mathrm{data}) = \int d\mu \; p(s, m, M, A, B, \mu|\mathrm{data})
   \propto \int dM_Z \left|\frac{d\mu}{dM_Z}\right| p(\mathrm{data}|s, m, M, A, B, \mu)\; p(s, m, M, A, B, \mu)
   = \mathcal{L}_{\mathrm{rest}} \left|\frac{d\mu}{dM_Z}\right|_{\mu_Z} p(s, m, M, A, B, \mu_Z) , \qquad (2.2) $$

where $\mu_Z$ is the value of $\mu$ that reproduces the experimental value of $M_Z$ for the given
values of $\{s, m, M, A, B\}$, and $p(s, m, M, A, B, \mu)$ is the prior in the initial parameters (still
undefined). This marginalized pdf can be written as



$$ p(s, m, M, A, B|\mathrm{data}) \propto 2\, \mathcal{L}_{\mathrm{rest}}\, \frac{\mu_Z}{M_Z\, c_\mu}\; p(s, m, M, A, B, \mu_Z) , \qquad (2.3) $$

where $c_\mu = \left|\frac{\partial \ln M_Z^2}{\partial \ln \mu}\right|$ is the conventional Barbieri-Giudice measure [13, 14] of the degree of
fine-tuning. Thus, the presence of this fine-tuning parameter in the denominator penalizes
the regions of the parameter space with large fine-tuning, as desired.
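
The way the Jacobian $|d\mu/dM_Z| = 2\mu_Z/(M_Z c_\mu)$ suppresses heavy spectra can be illustrated numerically. The sketch below assumes the standard tree-level minimization condition $M_Z^2 = 2[(m_{H_1}^2 - m_{H_2}^2 t^2)/(t^2-1) - \mu^2]$ with low-scale quantities and evaluates $c_\mu$ by finite differences. The soft-mass values used in the example are hypothetical and the function names are illustrative; the full analysis uses the radiatively corrected relations instead.

```python
import math

M_Z_EXP = 91.1876  # experimental Z mass in GeV


def mz_squared(mu, m_h1_sq, m_h2_sq, tan_beta):
    """Tree-level M_Z^2 from the Higgs minimization condition (all inputs at low scale)."""
    t2 = tan_beta ** 2
    return 2.0 * ((m_h1_sq - m_h2_sq * t2) / (t2 - 1.0) - mu ** 2)


def mu_z(m_h1_sq, m_h2_sq, tan_beta):
    """Positive mu reproducing the experimental M_Z, or None if EW breaking fails."""
    t2 = tan_beta ** 2
    mu_sq = (m_h1_sq - m_h2_sq * t2) / (t2 - 1.0) - 0.5 * M_Z_EXP ** 2
    return math.sqrt(mu_sq) if mu_sq > 0.0 else None


def c_mu(mu, m_h1_sq, m_h2_sq, tan_beta, rel_step=1e-6):
    """Barbieri-Giudice measure |d ln M_Z^2 / d ln mu| from a central finite difference."""
    up = mz_squared(mu * (1.0 + rel_step), m_h1_sq, m_h2_sq, tan_beta)
    down = mz_squared(mu * (1.0 - rel_step), m_h1_sq, m_h2_sq, tan_beta)
    centre = mz_squared(mu, m_h1_sq, m_h2_sq, tan_beta)
    return abs((up - down) / (2.0 * rel_step * centre))


def jacobian_weight(mu, c):
    """Factor 2 mu / (M_Z c_mu) that multiplies the likelihood and the prior in eq. (2.3)."""
    return 2.0 * mu / (M_Z_EXP * c)


if __name__ == "__main__":
    # Hypothetical low-scale Higgs soft masses squared (GeV^2), light and heavy spectrum.
    for scale in (1.0, 10.0):
        m1_sq, m2_sq = (800.0 * scale) ** 2, -((500.0 * scale) ** 2)
        mu = mu_z(m1_sq, m2_sq, 10.0)
        c = c_mu(mu, m1_sq, m2_sq, 10.0)
        print(scale, mu, c, jacobian_weight(mu, c))
```

In this tree-level approximation $c_\mu = 4\mu_Z^2/M_Z^2$, so the weight reduces to $M_Z/(2\mu_Z)$ and falls linearly as the spectrum is pushed up.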

As we will see in full detail in sect. 3, this is enough to make the high-scale region of
the parameter space, say soft terms above a few TeV, statistically insignificant, which in
turn allows us to consider a wide range for the soft parameters (up to the very $M_X$). In consequence,
the results of our analysis are essentially independent of the upper limits of the MSSM
parameters, in contrast with previous studies.
2.2 Treatment of nuisance variables and Yukawa couplings



It is common in statistical problems that not all the parameters that define the system are
of the same interest. In the problem at hand we are interested in determining the probability
maps for the MSSM parameters that describe the new physics, i.e. $\{m, M, A, B, \mu\}$, but
not (or not at the same level) for the SM-like parameters, denoted by $\{s\}$. However, the
nuisance parameters $\{s\}$ play an important role in extracting experimental consequences
from the MSSM. The usual technique to eliminate nuisance parameters is simply marginalizing
them, i.e. integrating the pdf (2.3) in the $\{s\}$ variables (for a review see ref. [?]).
When the value of a nuisance parameter is in one-to-one correspondence with a high-quality
experimental piece of information (included in $\mathcal{L}_{\mathrm{rest}}$), this integration simply selects the
"experimental" value of the nuisance parameter, which thus becomes (basically) a constant
with no further statistical significance in the analysis. Note that in that case the
prior on such a nuisance parameter becomes irrelevant. In the MSSM, nuisance parameters
of this class are the gauge couplings, $\{g_3, g, g'\}$, which thus can be extracted from the
analysis (for an extended discussion see [?]).

In the pure SM a similar argument can be used to eliminate the Yukawa couplings,
since they are in one-to-one correspondence with the quark and lepton masses. However, in
the MSSM these masses depend also on the ratio of the expectation values of
the two Higgses, i.e. $\tan\beta \equiv v_2/v_1$. At tree level, the up-type-quark, down-type-quark
and charged-lepton masses go like $m_u \sim y_u v \sin\beta$, $m_d \sim y_d v \cos\beta$ and $m_e \sim y_e v \cos\beta$
respectively, where $v^2 = 2(v_1^2 + v_2^2) = (246\ \mathrm{GeV})^2$ is proportional to the $Z$ mass squared.
Note that $\tan\beta$ is a derived quantity, which is obtained upon minimization of the scalar
potential $V(H_1, H_2)$, and thus takes different values at different points of the MSSM parameter
space. This means that two viable MSSM models (with the same fermion masses)
will have in general very different values of the Yukawa couplings, and thus the theoretical
prior, $p(y)$, will play a relevant and non-ignorable role in their relative probability. Any
Bayesian analysis of the MSSM amounts to an explicit or implicit assumption about the
prior in the Yukawa couplings.

In previous Bayesian analyses of the MSSM the role of the Yukawa couplings was
basically ignored: their values were just taken as needed to reproduce the experimental
fermion masses, within uncertainties. Even if this procedure can be seen as "sensible", it
is worth asking which kind of prior $p(y)$ it corresponds to. As shown in ref. [18] this can
be worked out by marginalizing the Yukawa couplings, using the experimental information
about the fermion masses. Let us briefly see how this works.

Just for the sake of the argument, let us approximate the associated likelihood as a
product of Dirac deltas

$$ \mathcal{L}_{\mathrm{fermion\ masses}} = \delta(m_t - m_t^{\mathrm{exp}})\; \delta(m_b - m_b^{\mathrm{exp}}) \cdots \qquad (2.4) $$

and the fermion masses by the tree-level expressions

$$ m_t = \frac{1}{\sqrt{2}}\, y_t^{\mathrm{low}}\, v\, s_\beta , \qquad m_b = \frac{1}{\sqrt{2}}\, y_b^{\mathrm{low}}\, v\, c_\beta , \qquad \mathrm{etc.} \qquad (2.5) $$

where $s_\beta \equiv \sin\beta$, $c_\beta \equiv \cos\beta$ and $y_i^{\mathrm{low}}$ are the low-energy Yukawa couplings. Suppose
further that $y_i^{\mathrm{low}} = R_i y_i$, where $y_i$ are the high-energy Yukawa couplings (and thus the
input parameters) and the renormalization-group (RG) factor $R_i$ does not depend on $y_i$
itself. Now, it is easy to work out the factor introduced in the pdf when the $y_i$ variables
are marginalized:



$$ \int [dy_t\, dy_b \cdots]\; p(y, m, M, A, B|\mathrm{data}) \sim \int [dy_t\, dy_b \cdots]\; p(y)\; \delta(m_t - m_t^{\mathrm{exp}})\; \delta(m_b - m_b^{\mathrm{exp}}) \cdots
   \sim p(y) \left|\frac{dy_t}{dm_t}\right| \left|\frac{dy_b}{dm_b}\right| \cdots \sim p(y)\; s_\beta^{-1} c_\beta^{-1} \cdots \qquad (2.6) $$

where $p(y)$ denotes the prior in the Yukawa couplings (which we assume factorizes
from the other priors). Eq. (2.6) represents the footprint of the Yukawa couplings in the
pdf. Now, taking logarithmically flat priors in the Yukawas, i.e. $p(y_i) \propto 1/y_i$, the
$s_\beta^{-1} c_\beta^{-1} \cdots$ factors get cancelled. This is therefore the prior implicitly assumed in the
previous Bayesian analyses, and the one we will adopt in this paper. Remarkably, for
independent reasons, we find the logarithmically flat prior for Yukawa couplings a most
sensible choice. Certainly there is no convincing origin for the experimental pattern of
fermion masses, and thus of Yukawa couplings. However, it is a fact that these come in
very assorted orders of magnitude (from $O(10^{-6})$ for the electron to $O(1)$ for the top),
suggesting that the underlying mechanism may produce Yukawa couplings of different
orders with similar efficiency; and this is the meaning of a logarithmic prior.
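
The cancellation of the $\beta$-dependent factor by a logarithmically flat Yukawa prior can be verified directly. The sketch assumes the simplified tree-level relation $m = y R v \sin\beta/\sqrt{2}$ (or $\cos\beta$ for down-type fermions) with a constant RG factor $R$; the helper names are illustrative.

```python
import math

V_EW = 246.0  # GeV


def yukawa_from_mass(mass, trig_beta, rg_factor=1.0):
    """High-scale Yukawa giving `mass` at tree level: m = y * R * v * trig_beta / sqrt(2)."""
    return math.sqrt(2.0) * mass / (rg_factor * V_EW * trig_beta)


def marginalization_factor(mass, trig_beta, prior, rg_factor=1.0):
    """p(y) * |dy/dm| evaluated at the Yukawa selected by the mass delta function."""
    y = yukawa_from_mass(mass, trig_beta, rg_factor)
    dy_dm = math.sqrt(2.0) / (rg_factor * V_EW * trig_beta)
    return prior(y) * dy_dm


def log_flat(y):
    """Logarithmically flat prior, p(y) proportional to 1/y."""
    return 1.0 / y


def flat(y):
    """Flat prior, p(y) constant."""
    return 1.0


if __name__ == "__main__":
    m_top = 173.0  # GeV, illustrative value
    for tan_beta in (2.0, 10.0, 50.0):
        s_beta = math.sin(math.atan(tan_beta))
        print(tan_beta,
              marginalization_factor(m_top, s_beta, log_flat),
              marginalization_factor(m_top, s_beta, flat))
```

With the logarithmic prior the factor equals $1/m$ for every $\tan\beta$, whereas the flat prior leaves a residual $1/s_\beta$ dependence.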

Of course the above discussion is oversimplified. The physical (pole) masses include
radiative corrections. Besides, the RG factor $R_i$ for the top Yukawa coupling has an
important dependence on the Yukawa itself. These subtleties have been incorporated into
the full analysis.



2.3 Variables for the MSSM scan 

Although our initial set of variables is $\{m, M, A, B, \mu, s\}$, and this is the one on which
we have to set our theoretical prior, for the purposes of scanning the MSSM parameter
space it is much more convenient to trade some of them for other parameters with more
direct phenomenological significance. We have already seen that it is worth trading $\mu$
for $M_Z$, which is automatically integrated out. Similarly, we have seen that the Yukawa
couplings are nuisance variables that are profitably traded for the physical fermion masses
and easily integrated out. Besides all this, it is highly advantageous to trade the initial
$B$-parameter for the derived $\tan\beta$ parameter. The main reason is that for a given viable
choice of $\{m, M, A, \tan\beta\}$, there are exactly two values of $\mu$ (with opposite sign and the
same absolute value at low energy) leading to the correct value of $M_Z$. Thus, working in
one of the two (positive and negative) branches of $\mu$, each point in the $\{m, M, A, \tan\beta\}$
space corresponds to exactly one model, whereas a point in the $\{m, M, A, B\}$ space may
correspond to several models, introducing a conceptual and technical complication in the
analysis.
Consequently we should compute the whole Jacobian, $J$, of the transformation

$$ \{\mu, y_t, B\} \to \{M_Z, m_t, t\} , \qquad t \equiv \tan\beta . \qquad (2.7) $$

Then the effective prior in the new variables becomes

$$ p_{\mathrm{eff}}(g_i, m_t, m, M, A, \tan\beta) \equiv J|_{\mu=\mu_Z}\; p(g_i, y_t, m, M, A, B, \mu = \mu_Z) , \qquad (2.8) $$

where we have already marginalized $M_Z$ using the associated likelihood $\sim \delta(M_Z - M_Z^{\mathrm{exp}})$
(recall that $\mu_Z$ is the value of $\mu$ that reproduces the experimental $M_Z$). In eqs. (2.7), (2.8)
we have made explicit the dependence just on the top Yukawa coupling and mass, but the
same holds for the other fermions.

So, to prepare the scan in the new variables we need an explicit evaluation of the
Jacobian factor. For that we must know the dependence of the old variables on the new
ones. This dependence is extracted from the minimization equations of the Higgs scalar
potential, $V(H_1, H_2)$ (which connect $\{\mu, B\}$ with $\{M_Z, \tan\beta\}$), and from the relation between
the Yukawa couplings and fermion masses. These dependences, even when radiative
corrections are included, have the form

$$ \mu = f(M_Z, y, t) , \qquad y = g(M_Z, m_t, t) , \qquad B = h(\mu, y, t) , \qquad (2.9) $$

where $f, g, h$ are well defined functions, for which we give approximate analytical expressions
below. Here we have made explicit only the dependence on the variables involved in
the change of variables (2.7). Note that $y$ depends on $M_Z$ since $v \propto M_Z$. In consequence

$$ J = \left| \frac{\partial f}{\partial M_Z}\, \frac{\partial g}{\partial m_t}\, \frac{\partial h}{\partial t} \right| , \qquad (2.10) $$

where the factor $\partial f/\partial M_Z$ carries essentially the fine-tuning penalization discussed in subsect. 2.1.

For the numerical analysis we have evaluated $J$ using the SoftSusy code [20], which
implements the full one-loop contributions and leading two-loop terms to the tadpoles for
the electroweak breaking conditions, with parameters running at two loops. This essentially
corresponds to the next-to-leading log approximation.

However, it is possible to give an analytical and quite accurate expression of $J$ by
working with the tree-level potential with parameters running at one loop (i.e. essentially
the leading log approximation). Then

$$ \mu_{\mathrm{low}}^2 = \frac{m_{H_1}^2 - m_{H_2}^2\, t^2}{t^2 - 1} - \frac{M_Z^2}{2} , \qquad (2.11) $$

$$ B_{\mathrm{low}} = \frac{s_{2\beta}}{2\mu_{\mathrm{low}}} \left( m_{H_1}^2 + m_{H_2}^2 + 2\mu_{\mathrm{low}}^2 \right) , \qquad (2.12) $$

$$ y_{\mathrm{low}} = \frac{\sqrt{2}\, m_t}{v\, s_\beta} . \qquad (2.13) $$

Here the "low" subscript indicates that the quantity is evaluated at low scale (more precisely,
at a representative supersymmetric mass, such as the geometric average of the stop
masses). The soft masses $m_{H_i}$ are also understood at low scale. For notational simplicity,
we have dropped the subscript $t$ from the Yukawa coupling. Note that all these low-energy
quantities contain an implicit dependence on the top Yukawa coupling through the corresponding
RG equations. Besides,

$$ \mu_{\mathrm{low}} = R_\mu(y)\, \mu , \qquad B_{\mathrm{low}} = B + \Delta_{RG}B(y) , \qquad y_{\mathrm{low}} \simeq \frac{y\, E(Q_{\mathrm{low}})}{1 + 6\, y\, F(Q_{\mathrm{low}})} , \qquad (2.14) $$

where $R_\mu(y)$, $\Delta_{RG}B(y)$ are definite functions of $y$; $Q$ is the renormalization scale,
$F = \int_{Q_{\mathrm{high}}}^{Q_{\mathrm{low}}} E\; d\ln Q$, and $E(Q)$ is a definite function that depends just on the gauge couplings
[21]. Plugging eqs. (2.11)-(2.14) into eqs. (2.9), (2.10) it is straightforward to get an explicit
expression for $J$. Plugging the latter back into eq. (2.8) we get an approximate form for
the effective prior



$$ p_{\mathrm{eff}}(m_t, m, M, A, \tan\beta) \propto \left[\frac{E}{R_\mu^2}\right] \frac{y}{y_{\mathrm{low}}}\; \frac{t^2 - 1}{t(1 + t^2)}\; \frac{B_{\mathrm{low}}}{\mu_Z}\; p(m, M, A, B, \mu = \mu_Z) , \qquad (2.15) $$

where we have taken a logarithmically flat prior for the Yukawa couplings (i.e. $p(y_i) \propto y_i^{-1}$),
as discussed above. This is the prior to be used when the MSSM parameter space is scanned
in the usual variables $\{m, M, A, \tan\beta\}$ and $\mu$ is taken as required to reproduce the correct
EW breaking (the information about $M_Z^{\mathrm{exp}}$ is thus automatically incorporated). Let us
stress that its form stems just from the relation between the initial variables and the
phenomenological ones, indicated in eq. (2.7), and it is not "subjective" at all. Besides,
the prefactor in the r.h.s. of eq. (2.15) (which is essentially the Jacobian) is valid for any
MSSM, not just the CMSSM. The subjectivity lies in the $p(m, M, A, B, \mu)$ piece, i.e. the
prior in the initial parameters, for which we have still to make a choice. Furthermore,
the prefactor in eq. (2.15) contains the above-discussed penalization of fine-tuned regions,
something that may not be so obvious, but that will become clear in sect. 3. Finally, the
form of the prefactor implies an effective penalization of large $\tan\beta$, reflecting the smaller
statistical weight of this possibility. Actually, the implicit fine-tuning associated to a large
$\tan\beta$ was already noted in refs. [22, 23], where it was estimated to be of order $1/\tan\beta$, in
agreement with eq. (2.15). This is logical. From eq. (2.12) we see that

$$ \frac{1}{\tan\beta} \simeq \frac{\mu_{\mathrm{low}}\, B_{\mathrm{low}}}{m_{H_1}^2 + m_{H_2}^2 + 2\mu_{\mathrm{low}}^2} . \qquad (2.16) $$

The denominator of this expression is of the order of the square of the typical soft terms
(whose size we will call $M_S$). Therefore a large $\tan\beta$ requires abnormally small $\mu_{\mathrm{low}} B_{\mathrm{low}}$.
As a matter of fact, $\mu$ cannot be very small, otherwise the mass of the lightest chargino would be below the
experimental limit. Therefore large $\tan\beta$ requires very small $B_{\mathrm{low}}$. But this cannot be naturally
arranged since the radiative contributions to $B$ (i.e. its RG evolution from high to low
scale) are sizeable (of order $M_S$) [?]. Thus small $B_{\mathrm{low}}$ requires a tuning between its initial
(high scale) value and the radiative corrections.
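
The large-$\tan\beta$ suppression can be made concrete with a few lines. The sketch evaluates only the explicitly $t$-dependent factor $(t^2-1)/(t(1+t^2))$ of the effective prior and the tree-level $B_{\mathrm{low}}$ written with $s_{2\beta}/2 = t/(1+t^2)$; the mass values in the example are hypothetical and are not required to satisfy the minimization conditions.

```python
def tan_beta_factor(t):
    """Explicit t-dependence of the effective-prior prefactor: (t^2 - 1) / (t (1 + t^2))."""
    return (t * t - 1.0) / (t * (1.0 + t * t))


def b_low(mu_low, m_h1_sq, m_h2_sq, t):
    """Tree-level low-scale B: (sin 2beta / (2 mu)) (m_H1^2 + m_H2^2 + 2 mu^2), sin 2beta / 2 = t / (1 + t^2)."""
    return t / (1.0 + t * t) * (m_h1_sq + m_h2_sq + 2.0 * mu_low ** 2) / mu_low


if __name__ == "__main__":
    # Hypothetical low-scale inputs in GeV and GeV^2.
    mu_low, m1_sq, m2_sq = 500.0, 800.0 ** 2, -(500.0 ** 2)
    for t in (5.0, 10.0, 50.0):
        print(t, tan_beta_factor(t), t * tan_beta_factor(t), b_low(mu_low, m1_sq, m2_sq, t))
```

For large $t$ the factor approaches $1/t$, and $B_{\mathrm{low}}$ at fixed masses falls in the same way, which is the tuning described above.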
2.4 Priors in the initial parameters



The choice of the prior in the initial parameters, $\{m, M, A, B, \mu\}$, must reflect our knowledge
about them before consideration of the experimental data (to be included in the likelihood
piece). In our case, we have already made some non-trivial, though quite reasonable,
assumptions about them, namely the hypothesis of universality of the soft terms (which
is supported by the strong constraints from FCNC processes) at a very high scale (this
restricts the analysis to scenarios where the transfer of SUSY breaking is suppressed by a
high scale, as happens e.g. in models with gravity-mediated SUSY breaking).

To go further we must consider the dynamical origin of the parameters. Four of them,
$\{m, M, A, B\}$, are soft SUSY-breaking parameters. They typically go like $\sim F/\Lambda$, where $F$
is the SUSY breaking scale, which corresponds to the dominant VEV among the auxiliary
fields in the SUSY breaking sector (it can be an $F$-term or a $D$-term) and $\Lambda$ is the
messenger scale, associated to the interactions that transmit the breaking to the observable
sector. Since the soft-breaking terms share a common origin it is logical to assume that
their sizes are also similar. Of course, there are several contributions to a particular soft
term, which depend on the details of the superpotential, the Kähler potential and the gauge
kinetic function of the complete theory (see e.g. ref. [24]). So, it is reasonable to assume
that a particular soft term can get any value (with essentially flat probability) of the order
of the typical size of the soft terms or below it. There are special cases, like split SUSY
scenarios, where the soft terms can be classified in two groups that feel the
breaking of SUSY differently. In those instances, the priors should also be considered in two separate
groups. But those cases are outside the scope of the present analysis, which is focussed on
the simplest, most conventional and least baroque framework, which consists of a common
SUSY breaking origin and transmission for all the soft terms. The $\mu$-parameter is not a
soft term, but a parameter of the superpotential. However, it is desirable that its size is
related (e.g. through the Giudice-Masiero mechanism [25]) to the SUSY breaking scale.
Otherwise, one has to face the so-called $\mu$-problem, i.e. why the size of $\mu$ should be similar
to that of the soft terms, as is required for a correct electroweak breaking (see eq. (2.11)). Thus,
concerning the prior, we can consider $\mu$ on the same footing as the soft terms.

Now, we are going to make the previous discussion more quantitative. Let us call $M_S$
the typical size of the soft terms in the observable sector, $M_S \sim F/\Lambda$. Then, we define the
ranges of variation of the initial parameters as

$$ \begin{aligned}
 -qM_S &\le B \le qM_S \\
 -qM_S &\le A \le qM_S \\
 0 &\le m \le qM_S \\
 0 &\le M \le qM_S \\
 0 &\le \mu \le qM_S \qquad (2.17)
\end{aligned} $$

where $q$ is an $O(1)$ factor. We have considered here the branch of positive $\mu$. For the
negative one we simply replace $\mu \to -\mu$. We have taken the same $q$ for all the parameters,
since we find no reason to make distinctions among them. Note that we can take $q = 1$
with no loss of generality, provided $M_S$ is allowed to vary in the range $0 \le M_S < \infty$. In
practice, to avoid divergences in the priors, we have to take a finite range for $M_S$, say

$$ M_S^0 \le M_S \le M_X , \qquad M_S^0 \sim 10\ \mathrm{GeV} . \qquad (2.18) $$

Nevertheless, the values of the upper and lower limits of the $M_S$ range are going to be
irrelevant, as will become clear soon. Consequently, we can still take $q = 1$.

We have discussed the ranges of the parameters, but not the shape of the priors. As
already stated, we find it reasonable to assume (conveniently normalized) flat priors for the
soft parameters inside the ranges (2.17), i.e.

$$ p(m) = p(M) = p(\mu) = \frac{1}{M_S} , \qquad p(A) = p(B) = \frac{1}{2M_S} . \qquad (2.19) $$

We still have to decide what the prior in $M_S$ is, and it is at this point that we have to
choose between a flat and a logarithmic prior in the scale of SUSY breaking. We
have considered the two possibilities throughout the paper. The comparison of the results
from both choices will give us a measure of the prior-dependence of the analysis.

Logarithmic prior 

Let us start by assuming a logarithmic prior in $M_S$, which we consider the most reasonable
option, since it amounts to considering all the possible orders of magnitude of the SUSY
breaking in the observable sector on the same footing (this occurs e.g. in conventional SUSY
breaking by gaugino condensation in a hidden sector). Then,

$$ p(M_S) = N_{M_S}\, \frac{1}{M_S} , \qquad (2.20) $$

where $N_{M_S}$ is a normalization constant, which turns out to be completely irrelevant. Now,
we can marginalize $M_S$, which thus disappears completely from the subsequent analysis,
leaving a prior which depends just on the $\{m, M, A, B, \mu\}$ parameters:²

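$$ p(m, M, A, B, \mu) = \frac{N_{M_S}}{4} \int_{\max\{m, M, |A|, |B|, \mu, M_S^0\}}^{M_X} \frac{1}{M_S^6}\; dM_S
   = \frac{N_{M_S}}{20} \left[ \frac{1}{[\max\{m, M, |A|, |B|, \mu, M_S^0\}]^5} - \frac{1}{M_X^5} \right]
   \simeq \frac{N_{M_S}}{20}\; \frac{1}{[\max\{m, M, |A|, |B|, \mu, M_S^0\}]^5} . \qquad (2.21) $$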



Of course, the prefactor is just an irrelevant normalization constant. Note that we have
neglected the $1/M_X^5$ term, which simply forces the prior to vanish strictly in the $M_X$ limit.
This effect is only appreciable when one of the parameters is close to $M_X$; otherwise it is
completely negligible (note the fifth power in the denominators). On the other hand, as
mentioned in subsect. 2.1 and as will become clear soon, once the EW breaking is incorporated
into the analysis, regions of very large initial parameters become irrelevant. In consequence,
eq. (2.21) is an excellent approximation. Note that the value of $M_X$ disappears from the
analysis. The value of the lower limit on $M_S$, i.e. $M_S^0$, is also irrelevant. Note that its
presence in the denominator of eq. (2.21) prevents the prior from diverging when the parameters
are very small. This "regulating" effect is only felt when all the parameters are below
$M_S^0$. However, we know that this region will be killed by the experimental data once they
are taken into account (through the likelihood piece in the pdf). E.g. the lower bounds
on chargino masses require $|\mu| > 100$ GeV. Hence, the value of $M_S^0$ plays no relevant role,
apart from the formal regularization of the prior. Let us recall that the above prior (2.21)
is the one to be plugged into eq. (2.8) (or into the approximate expression (2.15)) to get the
effective prior in the scan parameters.

² This procedure is a "hierarchical Bayesian technique", first used in ref. [?], but using complicated
functions that could not be integrated analytically.

It is interesting to compare the prior of eq. (2.21) with a "more conventional" logarithmic
prior, i.e. $p(m, M, A, B, \mu) \propto 1/(m\, M\, A\, B\, \mu)$. First of all, the "more conventional" prior
is not regulated unless one imposes that the parameters should not go below some low scale
(or that the prior is not logarithmically flat in that region). But then the results
are sensitive to the cut-off scale chosen. Note that the prior for phenomenologically viable
points, with e.g. very small $A$ and large $\mu$ (thus avoiding the constraints from chargino
masses), will depend on the precise treatment of this region. Apart from this annoyance, the
conventional logarithmic prior treats the parameters as uncorrelated objects. This produces
unrealistic distortions. E.g. a point of the parameter space where some parameters are
very large, but the others are very small, can have a value of the prior (i.e. an assigned
probability) larger than another point where all the parameters are O(TeV). However, this
goes against the expectation that all the initial parameters are likely to have similar sizes,
as they share a common dynamical origin. In other words, it is not sensible to increase
the prior probability (by a very significant amount) just because one of the parameters
is abnormally small compared to the others. These problems are nicely avoided by the
simple prior (2.21), reflecting the way it has been constructed.
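
Both statements, the closed form of the $M_S$ integral and the different behaviour of the two logarithmic priors, can be checked with a short sketch. It assumes $q = 1$, $M_S^0 = 10$ GeV and, purely for illustration, $M_X = 2 \times 10^{16}$ GeV; the two sample points are hypothetical, and normalization constants are dropped.

```python
import math


def joint_prior_log(m, M, A, B, mu, ms0=10.0):
    """Closed form up to normalization: 1 / max{m, M, |A|, |B|, mu, M_S^0}^5 (masses in GeV)."""
    return 1.0 / max(m, M, abs(A), abs(B), mu, ms0) ** 5


def joint_prior_log_numeric(m, M, A, B, mu, ms0=10.0, m_x=2.0e16, steps=20000):
    """Same prior from the explicit M_S integral of (1/4) M_S^-6, times 20 to drop the constant.

    The integral is done in u = ln M_S with the midpoint rule: (1/4) M_S^-6 dM_S = (1/4) exp(-5u) du.
    """
    lower = max(m, M, abs(A), abs(B), mu, ms0)
    u0, u1 = math.log(lower), math.log(m_x)
    h = (u1 - u0) / steps
    integral = h * sum(0.25 * math.exp(-5.0 * (u0 + (k + 0.5) * h)) for k in range(steps))
    return 20.0 * integral


def conventional_log_prior(m, M, A, B, mu):
    """Uncorrelated logarithmic prior, 1 / (m M |A| |B| mu); diverges if any parameter vanishes."""
    return 1.0 / (m * M * abs(A) * abs(B) * mu)


if __name__ == "__main__":
    balanced = (1000.0, 1000.0, 1000.0, 1000.0, 1000.0)  # all parameters O(TeV)
    lopsided = (5000.0, 5000.0, 1.0, 1.0, 200.0)         # two large, two tiny parameters
    for point in (balanced, lopsided):
        print(point, joint_prior_log(*point), conventional_log_prior(*point))
    print(joint_prior_log_numeric(*balanced) / joint_prior_log(*balanced))
```

The conventional prior assigns the lopsided point a far larger weight than the balanced one, while the prior built from a common scale $M_S$ does the opposite, since only the largest parameter matters.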



To finish this discussion, let us note that the prior (2.21) does not have the form of a
product of individual priors defined for each parameter. Still, we can get the form of the
prior for just one parameter, marginalizing the others before including any experimental
information. For instance, the prior in the gaugino mass, $M$, is obtained by marginalizing
in $m, A, B, \mu$, leading to

$$ p(M) \propto \frac{1}{\max\{M, M_S^0\}} , \qquad (2.22) $$

where we have neglected $\sim 1/M_X$ contributions. This has indeed the form of a logarithmically
flat prior. Of course, similar individual priors are obtained for the other parameters.

Flat prior 

We can now repeat the previous analysis, assuming a flat prior for $M_S$, which amounts
to considering all the values of the SUSY breaking in the observable sector on the same
footing. Hence we maintain the ranges for the parameters, eqs. (2.17), (2.18), and the flat
priors inside those ranges, eq. (2.19). We just replace the logarithmically flat prior in $M_S$,
eq. (2.20), by a flat prior



$$p(M_S) = N_{M_S} \qquad (2.23)$$



where $N_{M_S}$ is an irrelevant normalization constant $\sim 1/M_X$. Again, to obtain the prior in
the $\{m, M, A, B, \mu\}$ variables, we marginalize in $M_S$. The previous result (2.21) becomes
now



$$p(m, M, A, B, \mu) \propto \frac{1}{\left[\max\{m, M, |A|, |B|, \mu, M_S^0\}\right]^4} \qquad (2.24)$$

where, once more (and for the same reasons), we have neglected a $1/M_X$ contribution. The
difference with eq. (2.21) is that now we have one power less in the denominator. Again,
the prior (2.24) is the one to be plugged back into eq. (2.8) in order to get the effective prior
in the scan parameters.

We can also repeat the exercise of obtaining the prior for an individual parameter, say
$M$, by marginalizing the others. In this case, the previous equation (2.22) becomes

$$\mathcal{P}(M) \propto \ln\frac{M_X}{\max\{M, M_S^0\}} \qquad (2.25)$$

In essence this is a flat prior in $M$, as it does not change much over orders of magnitude.
E.g. in the 100 GeV < M < 4 TeV range it changes by just 13%. Again, the other
parameters behave in a similar way.
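
The size of this variation is easy to check numerically. The following is a minimal sketch, assuming a GUT-like value $M_X = 2 \times 10^{16}$ GeV and a low cut-off `M_S0` well below 100 GeV; both numbers are illustrative assumptions, not values fixed in this section.

```python
import math

# Assumed high scale in GeV (illustrative; not specified in this section).
M_X = 2.0e16

def flat_case_marginal(M, M_X=M_X, M_S0=10.0):
    """Unnormalized marginal prior of eq. (2.25): ln(M_X / max{M, M_S0}).

    M_S0 is a hypothetical low cut-off, chosen below the range of interest.
    """
    return math.log(M_X / max(M, M_S0))

low = flat_case_marginal(100.0)    # M = 100 GeV
high = flat_case_marginal(4000.0)  # M = 4 TeV

# Relative change of the prior across the 100 GeV - 4 TeV range.
relative_change = (low - high) / high
print(f"relative change between 100 GeV and 4 TeV: {relative_change:.3f}")
```

With these assumptions the prior changes by roughly 13% over a factor of 40 in $M$; the result depends only logarithmically on the value chosen for $M_X$.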



3. High-energy vs Low-energy regions and the EW breaking 



At first sight it may seem that the assumption of a logarithmic prior, see eqs. (2.21), (2.22),
amounts to a strong preference for the low-energy region of the parameter space, i.e. for
$\{m, M, A, B, \mu\}$ not far from the EW scale. However, this is not true. We may ask the
following question: what is a priori the relative probability that a parameter, say $M$, lies
in the low-energy (accessible to the LHC) region, 100 GeV < M < 2 TeV, versus the chance
that it lies at a higher scale, 2 TeV < M < $M_X$? Using eq. (2.22), it is clear that this relative
probability is

$$\frac{\mathcal{P}(100\ {\rm GeV} < M < 2\ {\rm TeV})}{\mathcal{P}(2\ {\rm TeV} < M < M_X)} \simeq \frac{1}{12} \qquad (3.1)$$

(in an obvious notation). I.e. in the initial set-up the most probable situation is that SUSY
escapes LHC detection, even with a logarithmic prior. Note that this is not so if one cuts off
the ranges of the parameters at a few TeV, as is very usually done, but we allow them to
vary all the way up to $M_X$. Of course, the situation is much more dramatic for a flat prior.
Using eq. (2.25) we see that in that case

P(100GeV<M<2TeV) 
P(2TeV < M < M x ) 

Hence, the flat-prior set-up assigns a negligible initial probability to LHC detection. 
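
Both prior ratios can be estimated in closed form from the marginal priors (2.22) and (2.25). The sketch below is only an illustration: it assumes $M_X = 2 \times 10^{16}$ GeV and a cut-off $M_S^0$ below 100 GeV, so that the marginals reduce to $1/M$ and $\ln(M_X/M)$ in the range of interest. The function names are ours.

```python
import math

def log_prior_ratio(M_lo, M_mid, M_X):
    """Prior mass in [M_lo, M_mid] over that in [M_mid, M_X] for P(M) ~ 1/M."""
    return math.log(M_mid / M_lo) / math.log(M_X / M_mid)

def flat_prior_ratio(M_lo, M_mid, M_X):
    """Same ratio for P(M) ~ ln(M_X / M).

    Uses the antiderivative F(M) = M * (ln(M_X / M) + 1).
    """
    def F(M):
        return M * (math.log(M_X / M) + 1.0)
    return (F(M_mid) - F(M_lo)) / (F(M_X) - F(M_mid))

M_X = 2.0e16  # GeV, assumed value of the high scale
r_log = log_prior_ratio(100.0, 2000.0, M_X)
r_flat = flat_prior_ratio(100.0, 2000.0, M_X)
print(f"logarithmic prior: {r_log:.3f}")
print(f"flat prior:        {r_flat:.2e}")
```

For the logarithmic prior the ratio comes out at the 10% level, in the same ballpark as eq. (3.1); the exact figure depends logarithmically on the value assumed for $M_X$. For the flat prior the ratio is many orders of magnitude smaller, since almost all the prior mass sits near $M_X$.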

Fortunately things change for the better as soon as we incorporate the experimental information
about the size of the EW breaking, i.e. $M_Z^{\rm exp}$. We have already discussed in subsect.
2.1 how $M_Z^{\rm exp}$ can be used to marginalize $\mu$, leaving a footprint of fine-tuning penalization.
Now we are going to be more precise. We will take the effective prior in the scan variables,
given by eq. (2.8) or by the approximate expression (2.15). Recall that these expressions
already incorporate the experimental information on $M_Z$ and the marginalization of $\mu$.

Now, we can evaluate once more the relative probability between the low- and high-energy
regions (say for the $M$ parameter again), but with this effective prior, i.e. incorporating
the information about the EW breaking scale. For the sake of clarity we present
now an analytical discussion, with some approximations, that gives correctly the essential
results and shows the physical reasons behind them. At the end we will present
the numerical results.

Hence, we have to marginalize the $\{m, A, \tan\beta\}$ parameters, since the $\mu$-parameter
has already been marginalized. Let us first perform the integration in $\{m, A\}$. Note that
for a given value $M = M_1$, and $\tan\beta$ fixed, only a portion of the $\{m, A\}$ plane will be able
to accommodate, by adjusting the $\mu$-parameter, the required EW breaking. Let us call this
region $R_1$. Therefore the integration is only extended to $R_1$:

$$\mathcal{P}(M_1, \tan\beta) \sim \int_{R_1} dm\, dA\; p_{\rm eff}(m_t, m, M, A, \tan\beta) \qquad (3.3)$$



where $p_{\rm eff}(m_t, m, M, A, \tan\beta)$ is given by eq. (2.8) or eq. (2.15). We have to compare this
probability with the one for a different gaugino mass, say $M_2 > M_1$ (we keep $\tan\beta$ fixed).
The expression for $\mathcal{P}(M_2, \tan\beta)$ is completely analogous to eq. (3.3). The only subtle point
is what the new allowed region, $R_2$, is. An approximate way to determine it is the following.
In almost all the parameter space the squared SUSY parameters, $\{m^2, M^2, A^2, B^2, \mu^2\}$, are
much larger than $M_Z^2$. Therefore, the combination of them producing the correct value
of $M_Z$ is almost identical to the one producing $M_Z = 0$. So we can approximate $R_1$ and
$R_2$ as the regions giving $M_Z = 0$. Now, if we neglect for a moment RG effects, it is clear
that for each point $\{m_1, A_1\} \in R_1$ there is another point $\{m_2, A_2\} \in R_2$, producing the
same breaking, given by $m_2 = \frac{M_2}{M_1} m_1$, $A_2 = \frac{M_2}{M_1} A_1$, $\mu_2 = \frac{M_2}{M_1} \mu_1$, $B_2 = \frac{M_2}{M_1} B_1$. In other
words, $R_2 \simeq \frac{M_2}{M_1} R_1$. RG effects do not in principle modify this relation since they are
proportional to the soft terms themselves. However, there is a residual effect: since the running
goes from $M_X$ (where the SUSY parameters are defined) down to the scale where the EW
breaking is evaluated ($\sim$ stop masses), there is a logarithmic correction, $\propto \log(M_2/M_1)$,
which would slightly modify the shape of $R_2$. On the other hand, there will be points in
$R_2$ that will go out of the allowed ranges of the parameters. I.e., $R_2$ will be slightly smaller
than $\frac{M_2}{M_1} R_1$. This means that we are slightly overestimating the weight of the high-scale
parameter space, which is a conservative attitude for this discussion. Now we can express
the integration in the $R_2$ region in terms of the $R_1$ one,



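$$\begin{aligned}
\mathcal{P}(M_2, \tan\beta) &\sim \int_{R_2} dm_2\, dA_2\; p_{\rm eff}(m_2, M_2, A_2, \mu_2, \tan\beta) \\
&= \int_{R_1} dm_1\, dA_1 \left(\frac{M_1}{M_2}\right)^3 p_{\rm eff}(m_1, M_1, A_1, \mu_1, \tan\beta) \\
&= \left(\frac{M_1}{M_2}\right)^3 \mathcal{P}(M_1, \tan\beta) , \qquad (3.4)
\end{aligned}$$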

Here we have used the fact that, when we assume a logarithmic initial prior, the effective prior,
$p_{\rm eff}$, scales as

$$p_{\rm eff}(m_2, M_2, A_2, \mu_2, \tan\beta) = \left(\frac{M_1}{M_2}\right)^5 p_{\rm eff}(m_1, M_1, A_1, \mu_1, \tan\beta) . \qquad (3.5)$$

This can be easily seen from the approximate expression of $p_{\rm eff}$ in eq. (2.15). This relation
is exact, essentially, up to small RG effects in the factor involved in (2.15).

The last step is to marginalize $\tan\beta$. But this will not affect the relative probability
we are interested in, since the factor obtained by integrating $\tan\beta$ is identical for $M_1$ and
for $M_2$. Alternatively, we can leave $\tan\beta$ fixed at some arbitrary value. The important
point is that the relative probability goes like

$$\frac{\mathcal{P}(M_2, \tan\beta)}{\mathcal{P}(M_1, \tan\beta)} \sim \frac{M_1^3}{M_2^3} . \qquad (3.6)$$

In other words,

$$\mathcal{P}(M, \tan\beta) \sim \frac{1}{M^3} \qquad (3.7)$$



This should be compared with the $\mathcal{P}(M) \sim \frac{1}{M}$ behaviour obtained in eq. (2.22), when the
experimental $M_Z$ was not taken into account. We see that the pdf in $M$ has gained two
powers of $M$ in the denominator. This is the fine-tuning penalization that arises on its
own from the Bayesian analysis. Now the relative probability of the low-energy (accessible
to the LHC) region versus the probability of a higher scale becomes

$$\frac{\mathcal{P}(100\ {\rm GeV} < M < 2\ {\rm TeV})}{\mathcal{P}(2\ {\rm TeV} < M < M_X)} \simeq \left(\frac{2\ {\rm TeV}}{100\ {\rm GeV}}\right)^2 \sim 10^2 \qquad (3.8)$$



to be compared with eq. (3.1) before including the EW breaking in the analysis. In conclusion,
once the EW breaking is correctly incorporated in the Bayesian analysis (but not
before!), 99% of the probability lives in the low-energy (LHC-relevant) region of the parameter
space. Note that this is achieved without invoking other kinds of constraints (like Dark
Matter or g-2 constraints) that are often used to set the scale of the soft terms not far from
the EW scale. Hence, the main reason to believe that SUSY should be accessible at LHC
scales comes from the EW breaking itself, i.e. the original motivation for phenomenological
SUSY. We find this result very satisfactory. Needless to say, for the other parameters
the results go in a completely analogous way.

We have discussed the impact of the EW breaking scale when a logarithmic prior is
used. For a flat prior, it is straightforward to repeat the above discussion, taking into
account that the prior $p(m, M, A, B, \mu)$ [and thus $p_{\rm eff}(m, M, A, \mu, \tan\beta)$] has one power of
mass less in the denominator. Therefore at the end of the day we arrive at

$$\mathcal{P}(M, \tan\beta) \sim \frac{1}{M^2} \qquad (3.9)$$

to be compared with the almost flat behaviour before including the experimental EW 
information, see eq.(2.24). Consequently, the relative probability of the low-energy and 
high-energy regions of the parameter space, for a flat prior, becomes 

$$\frac{\mathcal{P}(100\ {\rm GeV} < M < 2\ {\rm TeV})}{\mathcal{P}(2\ {\rm TeV} < M < M_X)} \simeq \frac{2\ {\rm TeV}}{100\ {\rm GeV}} \qquad (3.10)$$

So, even assuming a flat prior for the typical size of the soft breaking terms, up to the
$M_X$ scale, we see that the EW breaking is sufficient to put 90% of the total probability in
the LHC-interesting region. This contrasts strongly with previous analyses and, again, we
consider it very satisfactory.
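
The fractions quoted above follow directly from integrating the power laws (3.7) and (3.9). As a rough cross-check, here is a minimal sketch; the value of $M_X$ is an assumption and is numerically irrelevant here, since the integrals are dominated by the lower limits.

```python
def low_region_share(n, M_lo=100.0, M_mid=2000.0, M_X=2.0e16):
    """Share of probability in [M_lo, M_mid] for a pdf P(M) ~ M**(-n), n > 1.

    Returns (fraction of the total on [M_lo, M_X], ratio low/high).
    Masses are in GeV; M_X is an assumed high scale.
    """
    k = n - 1  # the integral of M**(-n) is proportional to M**(-k)
    low = M_lo ** (-k) - M_mid ** (-k)
    high = M_mid ** (-k) - M_X ** (-k)
    return low / (low + high), low / high

frac_log, ratio_log = low_region_share(3)    # eq. (3.7): P ~ 1/M^3
frac_flat, ratio_flat = low_region_share(2)  # eq. (3.9): P ~ 1/M^2
print(f"log prior : ratio {ratio_log:.0f}, fraction {frac_log:.4f}")
print(f"flat prior: ratio {ratio_flat:.0f}, fraction {frac_flat:.4f}")
```

The pure power laws give ratios of about $(2\ {\rm TeV}/100\ {\rm GeV})^2$ and $2\ {\rm TeV}/100\ {\rm GeV}$ respectively, i.e. low-energy fractions at or above the 99% and 90% levels quoted in the text.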

We have checked the previous arguments by performing the analysis in a numerical
way. For the posterior samples we adopt the MultiNest [26] algorithm as implemented
in the SuperBayeS code [27]. It is based on the framework of Nested Sampling, recently
invented by Skilling [28, 29]. MultiNest has been developed in such a way as to be an
extremely efficient sampler even for likelihood functions defined over a parameter space
of large dimensionality with a very complex structure, as is the case of the CMSSM.
The main purpose of MultiNest is the computation of the Bayesian evidence and its
uncertainty, but it produces posterior inferences as a by-product. For the marginalization
procedure we have used the above-discussed ranges for our priors, i.e. extending up to $M_X$ for
$m$, $M$ and $|A|$. Besides, we have used $2 < \tan\beta < 62$. The lower limit comes from the present
bound on the Higgs mass [30]. The upper one comes from imposing Yukawa couplings
in the perturbative regime [31]. The precise value $\tan\beta < 62$ has been chosen to allow
a strict comparison with previous analyses in refs. [||, ||]. However, the precise value of
this upper bound turns out to be irrelevant, as the region of very large $\tan\beta$ is strongly
suppressed (see the discussion after eq. (2.15) and ref. [32]).

In Fig. 1 the red line shows the prior in $M$ (upper panels) and $m$ (lower panels), when
the other parameters are marginalized, using a logarithmic (left panels) or flat (right panels)
initial prior for the scale of SUSY breaking in the observable sector ($M_S$). The blue bar
distributions show the pdf once the EW breaking is incorporated in the analysis, i.e. the
effective prior in the scan variables, $p_{\rm eff}$, see eq. (2.8) and the approximate form (2.15). The
logarithmic scale in the horizontal axes allows one to see that most of the probability, which
initially lies in the high-energy region ($M$, $m$ above the TeV scale), flows dramatically into
the low-energy region once the EW breaking is considered. Actually, most of the probability



[Figure 1 panels, labelled "Cabrera, Casas & Ruiz de Austri (2009)"; horizontal axis: $\log_{10}[m\ ({\rm GeV})]$]

Figure 1: 1D marginalized posterior probability distribution of the $M$ and $m$ parameters (upper
and lower panels respectively) for logarithmic (left panels) and flat (right panels) priors in the $\mu > 0$
case, for a scan including the information about the EW breaking ($M_Z^{\rm exp}$). The red lines represent
the marginalized prior. All given in arbitrary units.



falls inside the LHC discovery reach (even with just 1 fb$^{-1}$ [33, 34]). Quantitatively, the
results are in good agreement with the previous discussion. Although the distribution of
probability above 1 TeV is almost invisible in the plots (especially for log priors), it is
actually different from zero and follows from the approximate law of eq. (3.9). Notice also
that, at this stage, all the points are equally "best-fit points" (even at extremely large
$M$, $m$), since they are equally good at reproducing $M_Z$, the only experimental information so far
considered.



Besides making the high-energy parameter space quite irrelevant, the EW breaking has
another dramatic effect, which is visible in Fig. 1. Namely, the probability distributions
(pdfs) based on a logarithmic or on a flat prior are quite similar, after the incorporation of
the EW scale. That is, the favoured regions of the parameter space are quite independent
of the choice of the prior. Normally, a behaviour of this kind is attributed to the fact that
the data are powerful enough to select a region of the parameter space, so that the general
expression of the pdf, eq. (1.1), is dominated by the likelihood piece. However this is not the
case here. As a matter of fact, concerning the likelihood, there are points with arbitrarily
large parameters that are as good as the low-energy ones, since they correctly reproduce
$M_Z^{\rm exp}$, the only data so far considered. The low-energy region is preferred because it is
statistically much more significant, as we have discussed above. But this is a Bayesian
effect, non-existent in a frequentist analysis. Therefore the situation is very good from the
Bayesian point of view: the results are quite independent of the type of prior, but to
see the preferred regions we need the Bayesian procedures.

To finish this section, let us note that the previous statistical argument supports low-energy
supersymmetry breaking (in the observable sector), even in a landscape scenario.
In other words, even if there were many more vacua with supersymmetry breaking at a large
scale, most of the realistic vacua would correspond to low-energy supersymmetry breaking, for
rather generic a-priori distributions of all possible vacua (for related work along this line see
[35]).



4. Experimental Constraints 



In this section we will incorporate all the relevant experimental information into the likelihood
piece of the probability distribution (all but $M_Z^{\rm exp}$, which has already been taken into
account). This amounts to including many experimental observables and bounds, with their
error bars, and to calculating the predictions for them in the MSSM.

As originally demonstrated in [Q, ||], the values of the relevant SM-like parameters
(nuisance parameters) can strongly influence some of the CMSSM predictions. For our
analysis we take the set

$$\{M_t,\; m_b(m_b)^{\overline{MS}},\; \alpha_{\rm em}(M_Z)^{\overline{MS}},\; \alpha_s(M_Z)^{\overline{MS}}\} , \qquad (4.1)$$

where $M_t$ is the pole top quark mass, while the other three parameters (the bottom mass,
the electromagnetic and the strong coupling constants) are all evaluated in the $\overline{MS}$ scheme
at the indicated scales. The constraints on the SM nuisance parameters are given in Table 1.

On the other hand, there are the experimental values of accelerator and cosmological
observables, which are listed in Table 2. Instead of including all this information at once
and showing the results, we find it more illustrative to do it in several steps. This will allow
us to show the effect of the various types of data on the probability distributions (which are
sometimes opposite). On the other hand, not all the data are on the same footing of quality
and reliability, and it is convenient not to mix them from the beginning.

SM (nuisance) parameter                    | Mean value $\mu$ | Uncertainty $\sigma$ (exper.)
$M_t$                                      | 172.6 GeV        | 1.4 GeV
$m_b(m_b)^{\overline{MS}}$                 | 4.20 GeV         | 0.07 GeV
$\alpha_s(M_Z)^{\overline{MS}}$            | 0.1176           | 0.002
$1/\alpha_{\rm em}(M_Z)^{\overline{MS}}$   | 127.955          | 0.03


|33| 
|37] 
|37] 



Table 1: Experimental mean $\mu$ and standard deviation $\sigma$ adopted for the likelihood function for
SM (nuisance) parameters, assumed to be described by a Gaussian distribution.



In order to avoid a proliferation of plots we examine first the positive $\mu$ branch. In
the next section we will show the relevant plots and results for negative $\mu$ and perform a
comparison of the relative probability of the two possibilities.

4.1 EW and B-physics observables, and limits on particle masses 

We start by considering the most reliable and robust pieces of experimental information:
EW and B(D)-physics observables and lower bounds on the masses of supersymmetric
particles and the Higgs mass. The complete list of the observables of this kind used in our
analysis is given in Table 2 (all the entries except those concerning $(g-2)_\mu$ and dark matter
constraints).

To calculate the MSSM spectrum we use SoftSusy [38], where SUSY masses are computed
at full one-loop level and the Higgs sector includes two-loop leading corrections [39].
We discard points suffering from unphysicalities: no self-consistent solutions to the RGEs,
no EW breaking and tachyonic states. Furthermore, we require the neutralino to be the
lightest supersymmetric particle (LSP) in order to be an acceptable dark matter candidate.
The latter condition might be relaxed, as discussed in subsect. 4.3 below. In our treatment
of the radiative corrections to the electroweak observables $M_W$ and $\sin^2\theta_{\rm eff}$ we include full
two-loop and known higher order SM corrections as computed in ref. [40], as well as gluonic
two-loop MSSM corrections obtained in [41].

Roughly speaking, the MSSM parameter space is quite unconstrained by EW (LEP)
observables, except for quite small values of the SUSY soft terms (i.e. when the SUSY
corrections are sizeable) [42, 43]. This is logical. As is well known, the MSSM is free
from the Little Hierarchy problem, understood as the tension between LEP observables
and the need for new physics at O(TeV) scales to avoid the hierarchy problem [44]. This is
because R-parity prevents tree-level SUSY contributions to higher order SM operators.
In consequence, unless supersymmetric masses are quite small, the effect of SUSY on LEP
observables is not important.

Concerning B-physics observables, the branching ratio for the $B \to X_s\gamma$ decay (the
most important one) has been computed with the numerical code SusyBSG [45] using the
full NLO QCD contributions, including the two-loop calculation of the gluino contributions
presented in [46] and the results of [47] for the remaining non-QCD $\tan\beta$-enhanced
contributions. The supersymmetric contributions to $b \to s\gamma$ grow with decreasing masses of
the supersymmetric particles and with increasing $\tan\beta$. For $\mu > 0$ they have the "wrong
sign", so larger supersymmetric masses are preferred. However the SUSY contribution is
never dramatic for masses around 1 TeV or larger. For the determination of $\Delta M_{B_s}$ we
use expressions from ref. [48] which include dominant large $\tan\beta$-enhanced beyond-LO
SUSY contributions from Higgs penguin diagrams. The other B(D)-physics observables
summarized in Table 2 have been computed with the code SuperIso (for details on the
computation of the observables see and references therein). Both codes have been
integrated into SuperBayeS.

Experimental bounds used in the analysis are indicated in the second part of Table 
2. These include bounds on supersymmetric masses (squarks, sleptons, gluinos, charginos, 
neutralinos) and the Higgs mass. In general, the constraints on supersymmetric masses 
tend obviously to cut off the region of the parameter space with too small values of m, M. 
On top of this, the bound on the Higgs mass is most relevant, and deserves special attention, 
as we are about to see. For details on how the likelihood is computed we refer to ref. Q. 

For the quantities for which positive measurements have been made (as listed in the
upper part of Table 2), we assume a Gaussian likelihood function with a variance given
by the sum of the theoretical and experimental variances, as motivated by eq. (3.3) in
ref. |3]]. For the observables for which only lower or upper limits are available (as listed
in the bottom part of Table 2) we use a smoothed-out version of the likelihood function
that accounts for the theoretical error in the computation of the observable, see eq. (3.5)
and fig. 1 in ref. Q. In particular, in applying a lower mass bound from LEP-II on the
Higgs boson $h$ we take into account its dependence on its coupling to the $Z$ boson pairs,
$\zeta_h^2$, as described in detail in ref. |p^|. When $\zeta_h^2 \simeq 1$, the LEP-II lower bound of 114.4 GeV
(95% CL) [61] applies. For arbitrary values of $\zeta_h$ we apply the LEP-II 95% CL bounds
on $m_h$, which we translate into the corresponding 95% CL bound in the $(m_h, \zeta_h)$ plane.
We then add a conservative theoretical uncertainty $\tau(m_h) = 3$ GeV, following eq. (3.5) in
ref. ||. The best fit is then defined as the maximum value of the joint likelihood function.
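
The two kinds of likelihood described above can be sketched as follows. The Gaussian piece uses the combined width $s = \sqrt{\sigma^2 + \tau^2}$ quoted in the text. The smoothed lower bound is only an illustrative stand-in (a hard cut convolved with a Gaussian of width $\tau$); it is an assumption of ours and not the expression of eq. (3.5) in the cited reference. Function names are hypothetical.

```python
import math

def gaussian_likelihood(pred, mean, sigma_exp, tau_theo):
    """Gaussian likelihood with total width s = sqrt(sigma^2 + tau^2)."""
    s = math.hypot(sigma_exp, tau_theo)
    z = (pred - mean) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

def smoothed_lower_bound(pred, bound, tau_theo):
    """Illustrative smoothed step for a lower limit.

    A hard cut at `bound` convolved with a Gaussian of width tau_theo.
    This is a stand-in, NOT the exact form used in the cited reference.
    """
    return 0.5 * (1.0 + math.erf((pred - bound) / (math.sqrt(2.0) * tau_theo)))

# M_W entry of Table 2: mean 80.398 GeV, sigma = 27 MeV, tau = 15 MeV.
s_MW = math.hypot(0.027, 0.015)
L_central = gaussian_likelihood(80.398, 80.398, 0.027, 0.015)
L_shifted = gaussian_likelihood(80.398 + s_MW, 80.398, 0.027, 0.015)

# Higgs lower bound of 114.4 GeV smeared with tau(m_h) = 3 GeV.
w_at_bound = smoothed_lower_bound(114.4, 114.4, 3.0)
w_above = smoothed_lower_bound(120.0, 114.4, 3.0)
print(f"combined M_W width: {1000.0 * s_MW:.1f} MeV")
```

In this simplified form a prediction sitting exactly at the bound receives half weight, and the weight approaches one a few $\tau$ above the bound, which is the qualitative behaviour a smeared limit is meant to have.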

Fig. 2 (upper panels) shows the pdf for the gaugino mass parameter, $M$, once all
this experimental information is incorporated. Again, the left (right) panels correspond
to a logarithmic (flat) initial prior for the scale of SUSY breaking in the observable sector
($M_S$). The reason to show the pdf of $M$ is to facilitate the comparison with the analogous
probability distribution before the inclusion of the new pieces of experimental information
(Fig. 1). Clearly, the bulk of the probability is now pushed into the high-energy region.
This effect is basically due to the Higgs mass bound. As discussed above, concerning the
other observables, everything works fine, as long as SUSY is not at too low a scale. On the
other hand, it is well known that in the MSSM the tree-level Higgs mass is bounded from
above by $M_Z$, so radiative corrections (which grow logarithmically with the stop masses)
are needed.

Observable                                          | Mean value $\mu$ | $\sigma$ (exper.)   | $\tau$ (theor.)
$M_W$                                               | 80.398 GeV       | 27 MeV              | 15 MeV
$\sin^2\theta_{\rm eff}$                            | 0.23149          | $17 \times 10^{-5}$ | $15 \times 10^{-5}$
$a_\mu^{\rm exp} \times 10^{10}$                    | 11659208.9       | 6.33                |
$\delta a_\mu \times 10^{10}$ ($e^+e^-$)            | 29.5             | 8.8                 | 2.0
$\delta a_\mu \times 10^{10}$ ($\tau$)              | 14.8             | 8.2                 | 2.0
$\Delta M_{B_s}$                                    | 17.77 ps$^{-1}$  | 0.12 ps$^{-1}$      | 2.4 ps$^{-1}$
$BR(B \to X_s\gamma) \times 10^{4}$                 | 3.52             | 0.33                | 0.3
$BR(B_u \to \tau\nu)/BR(B_u \to \tau\nu)_{\rm SM}$  | 1.28             | 0.38                |
$\Delta_{0-} \times 10^{2}$                         | 3.6              | 2.65                |
$BR(B \to D\tau\nu)/BR(B \to De\nu) \times 10^{2}$  | 41.6             | 12.8                | 3.5
[label lost]                                        | 1.004            | 0.007               |
$BR(D_s \to \tau\nu) \times 10^{2}$                 | 5.7              | 0.4                 | 0.2
$BR(D_s \to \mu\nu) \times 10^{3}$                  | 5.8              | 0.4                 | 0.2
$\Omega_\chi h^2$                                   | 0.1099           | 0.0062              | 0.1

Observable                  | Limit (95% CL)               | $\tau$ (theor.)
$BR(B_s \to \mu^+\mu^-)$    | $< 5.8 \times 10^{-8}$       | 14%
$m_h$                       | > 114.4 GeV (SM-like Higgs)  | 3 GeV
$\zeta_h^2$                 | $f(m_h)$                     | negligible
[label lost]                | > 375 GeV                    | 5%
[label lost]                | > 289 GeV                    | 5%
other sparticle masses      | As in table 4 of ref.        |

Table 2: Summary of the observables used in the analysis to constrain the CMSSM parameter
space. Upper part: Observables for which a positive measurement has been made. $\delta a_\mu = a_\mu^{\rm exp} - a_\mu^{\rm SM}$
denotes the discrepancy between the experimental value and the SM prediction of the anomalous
magnetic moment of the muon $(g-2)_\mu$. As explained in the text, for each quantity we use a
likelihood function with mean $\mu$ and standard deviation $s = \sqrt{\sigma^2 + \tau^2}$, where $\sigma$ is the experimental
uncertainty and $\tau$ represents our estimate of the theoretical uncertainty. Lower part: Observables for
which only limits currently exist. The likelihood function is given in ref. ||, including in particular
a smearing out of experimental errors and limits to include an appropriate theoretical uncertainty
in the observables. $m_h$ stands for the light Higgs mass while $\zeta_h^2 = g^2(hZZ)_{\rm MSSM}/g^2(hZZ)_{\rm SM}$,
where $g$ stands for the Higgs coupling to the $Z$ and $W$ gauge boson pairs. The references for the
theoretical calculations are given in the text.

It is possible to be more quantitative by considering the dominant 1-loop correction
[62, 63] to the theoretical upper bound on $m_h$ in the MSSM:

$$m_h^2 \le M_Z^2 \cos^2 2\beta + \frac{3 m_t^4}{2\pi^2 v^2} \log\frac{M_{\tilde t}^2}{m_t^2} + \ldots \qquad (4.2)$$

where $m_t$ is the (running) top mass and $M_{\tilde t}$ is an average of stop masses. Hence, for a
given lower bound on the Higgs mass, $m_h^{\rm min}$, one needs

$$M_{\tilde t} \gtrsim e^{-2.1 \cos^2 2\beta}\; e^{(m_h^{\rm min}/62\ {\rm GeV})^2}\; m_t . \qquad (4.3)$$

[Figure 2 panels: horizontal axes $M$ (TeV); panel labels "Log priors, CMSSM, $\mu > 0$" and "Flat priors, CMSSM, $\mu > 0$"]

Figure 2: Upper panels show the 1D marginalized posterior probability distribution of the $M$
parameter for logarithmic (left panel) and flat (right panel) priors in the $\mu > 0$ case for a scan
including SM nuisance parameters constraints, EW breaking ($M_Z^{\rm exp}$), collider limits on Higgs and
superpartner masses, and EW and B(D)-physics observables. Lower panels show the same but
imposing a bound for the Higgs mass of $m_h > 120$ GeV. The cross corresponds to the best-fit point,
defined as the one with highest likelihood.



Thus, an increase $\Delta m_h^2$ in the lower bound of the Higgs mass squared approximately
translates into a multiplicative factor for $M_{\tilde t}$:

$$M_{\tilde t} \to M_{\tilde t}\; e^{\Delta m_h^2/(62\ {\rm GeV})^2} , \qquad (4.4)$$

and a similar increase can be expected in the initial parameters $m$, $M$.
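
To get a feeling for the numbers, the bound (4.3) and the rescaling (4.4) can be evaluated directly. This is a minimal sketch: the choice $\cos^2 2\beta \simeq 1$ (large $\tan\beta$) and a running top mass of 165 GeV are illustrative assumptions, not inputs taken from the text.

```python
import math

SCALE = 62.0  # GeV, the scale appearing in eqs. (4.3) and (4.4)

def stop_mass_lower_bound(mh_min, cos2_2beta, m_t):
    """Approximate lower bound on the average stop mass, eq. (4.3)."""
    return math.exp(-2.1 * cos2_2beta) * math.exp((mh_min / SCALE) ** 2) * m_t

def rescaling_factor(mh_old, mh_new):
    """Multiplicative factor of eq. (4.4) when the Higgs bound is raised."""
    delta_mh2 = mh_new ** 2 - mh_old ** 2
    return math.exp(delta_mh2 / SCALE ** 2)

# Raising the Higgs bound from the LEP-II value to 120 GeV.
factor = rescaling_factor(114.4, 120.0)

# Hypothetical inputs for illustration: cos^2(2 beta) ~ 1, m_t = 165 GeV.
bound_114 = stop_mass_lower_bound(114.4, 1.0, 165.0)
bound_120 = stop_mass_lower_bound(120.0, 1.0, 165.0)
print(f"rescaling factor for 114.4 -> 120 GeV: {factor:.2f}")
```

Under these assumptions, raising the bound from 114.4 GeV to 120 GeV multiplies the required stop mass by about 1.4, and the factor grows exponentially with further increases in $m_h^2$.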

To illustrate these facts, we have re-done the pdfs assuming a different value of the
Higgs mass bound, say $m_h > 120$ GeV. Of course this would correspond to the real situation
if the Higgs mass turns out finally to lie in this range. According to the previous
argument, we can expect now a larger push of the probability distribution into the high-energy
region. And this is what happens, as is shown in Fig. 2 (lower panels). The effect
is very important, given the modest increase in the Higgs mass bound. Larger shifts in $m_h$
have an exponentially larger effect, as discussed above. So, if the MSSM is true and we
wish to detect it at the LHC, let us hope that $m_h$ is close to the present experimental limit$^3$.

Figure 3: 1D marginalized posterior probability distribution of the CMSSM parameters for logarithmic
(left panels) and flat (right panels) priors in the $\mu > 0$ case for a scan including SM nuisance
parameters constraints, EW breaking ($M_Z^{\rm exp}$), collider limits on Higgs and superpartner masses, and
EW and B(D)-physics observables. The cross corresponds to the best-fit point.

Parameter       | Mean value | Best fit | 68% (95%) range
$m_h$ (GeV)     | 240.8      |          | [235.5 : 246.8] ([222.6 : 251.3])
$R^{-1}$ (GeV)  | 589.3      |          | [511.5 : 668.6] ([467 : 781.9])

Table 3: Higgs mass mean value and best fit for logarithmic and flat priors, and the 68% and 95%
Bayesian equal-tails credibility intervals. All numbers are given in GeV units.

Fig. 3 shows some representative probability distributions for individual (initial and
derived) parameters, i.e. once all the rest are marginalized. The dimensionful parameters
($m$, $A$) follow a trend similar to that of the gaugino mass, $M$ (which was already shown in
Fig. 2). On the other hand, large values of $\tan\beta$ are penalized, mainly due to the Jacobian
factor in the probability distribution, see eq. (2.15) and the subsequent discussion. It is
worth remarking that this penalization of $\tan\beta$ contrasts with other Bayesian analyses,
where the prior for $\tan\beta$ was taken as flat. Here it arises from the above-mentioned Jacobian
factor and therefore has nothing to do with a particular choice of priors. Fig. 4
shows the probability distribution for the Higgs mass. One can see that there is a significant
number of points which evade the LEP-II 114.4 GeV lower bound for the SM
Higgs. This reflects the fact that we have employed the full likelihood function in the
$(m_h, \zeta_h)$ plane as described above, which allows points with low Higgs masses where
$\zeta_h^2 = \sin^2(\beta - \alpha) \ll 1$. The corresponding Bayesian credibility intervals, representing the
68% and 95% of the total probability, are given in Table 3. The central value for the
Higgs mass is at 117-118 GeV. From that table one can see the robustness of the results
under changes of the prior. Notice also the small discrepancy between the mean value of the
posterior pdf and the best fit.
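
For reference, an equal-tails credibility interval of the kind quoted in Table 3 can be obtained from posterior samples as sketched below. This assumes equally weighted samples (weighted nested-sampling output would first have to be resampled or handled with weighted quantiles), and the input used here is synthetic, serving only to exercise the function.

```python
def equal_tails_interval(samples, credibility):
    """Equal-tails credible interval from equally weighted posterior samples.

    Leaves (1 - credibility)/2 of the probability in each tail; quantiles
    are linearly interpolated between neighbouring order statistics.
    """
    xs = sorted(samples)
    n = len(xs)

    def quantile(q):
        pos = q * (n - 1)
        i = int(pos)
        frac = pos - i
        if i + 1 < n:
            return xs[i] + frac * (xs[i + 1] - xs[i])
        return xs[i]

    tail = 0.5 * (1.0 - credibility)
    return quantile(tail), quantile(1.0 - tail)

# Synthetic, evenly spaced values (not posterior samples from the analysis).
toy = [float(i) for i in range(1001)]
lo68, hi68 = equal_tails_interval(toy, 0.68)
lo95, hi95 = equal_tails_interval(toy, 0.95)
print(f"68%: [{lo68:.1f} : {hi68:.1f}]   95%: [{lo95:.1f} : {hi95:.1f}]")
```

Unlike a highest-density interval, the equal-tails construction need not contain the mode or the best-fit point when the posterior is skewed.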

Fig. 5 shows the probability distribution in the $\{M, m\}$ and $\{\tan\beta, M\}$ planes (i.e.
when all the parameters but two are marginalized). The results of Figs. 3-5 are all shown
for logarithmic (left panels) and flat (right panels) priors, exhibiting a remarkable stability,
which has already been discussed. In the $\{M, m\}$ plots we have also shown the discovery
reach of the LHC for 1 fb$^{-1}$ and 100 fb$^{-1}$ (with a center-of-mass energy of 14 TeV). These
lines have been taken from ref. [34]. They arise from a study of events with $N_{\rm jets} > 2$ and an
optimization of the cuts on $E_T^{\rm miss}$. (For a more detailed explanation of the procedure used
see [29]). Strictly speaking, the lines correspond to $A = 0$, $\tan\beta = 45$, but they provide a
good indication of the LHC discovery potential in the short and medium term (for similar
analyses see [33]). Now, it is clear that a substantial (though still non-dominant) part
of the probability falls out of the LHC reach, an effect that is more important for the flat
prior. This means that if we are unlucky, supersymmetry could evade LHC detection in
the short, or even the long, term. On top of this, let us recall that if the Higgs mass is not
close to its present experimental limit, the preferred regions of the parameter space are
quickly pushed to high energy (see the discussion about Fig. 2), thus jeopardizing the discovery
of supersymmetry.

3 Certainly, it is well known that a Higgs above 125 GeV is not easy to arrange in the MSSM, and that
is at the origin of the difficulties. What the present analysis shows, in a more direct way, is how improbable
it is to arrange a large $m_h$ (see also Fig. 4 below) and the implications for the discovery of SUSY at the LHC.

Figure 4: As Fig. 3, for the Higgs mass. The small filled circle represents the mean value of the
posterior pdf and the cross corresponds to the best-fit point.

4.2 Constraints from $(g-2)_\mu$

The magnetic anomaly of the muon, $a_\mu = \frac{1}{2}(g-2)_\mu$, has been a classical and powerful test for
new physics. At present, the uncertainties in the experimental and theoretical
determinations are on the verge of strongly constraining, or even giving a positive signal
of, new physics. However, the situation is still somewhat uncertain, due essentially to
inconsistencies between alternative determinations of the SM hadronic contribution, more
precisely the contribution coming from the hadronic vacuum polarization diagram, say
$\delta_{\rm had} a_\mu$. This contribution can be expressed in terms of the total hadronic cross section
$e^+e^- \to$ hadrons. Using direct experimental data for this cross section, one obtains a final result
for $a_\mu^{\rm SM}$ which is more than $3\sigma$ away from the current experimental determination, namely
$\delta a_\mu = a_\mu^{\rm exp} - a_\mu^{\rm SM} = (29.5 \pm 8.8) \times 10^{-10}$. This has often been claimed as a signal of new
physics. Obviously, if one accepts this point of view, the discrepancy should be cured by
contributions of new physics, in our case MSSM contributions. The immediate implication
is that supersymmetric masses should be brought to quite small values, in order to produce
a large enough contribution, $\delta^{\rm MSSM} a_\mu$, to reconcile theory and experiment. Hence, SUSY
should live at low energy (accessible to the LHC), mainly because of $a_\mu$. This is an independent
argument from the one based on the size of the EW scale, which has been discussed in
sect. 3.

Figure 5: 2D marginalized posterior probability distribution for logarithmic (left panels) and flat
(right panels) priors in the $\mu > 0$ case for a scan including SM nuisance parameters constraints,
EW breaking ($M_Z^{\rm exp}$), collider limits on Higgs and superpartner masses, and EW and B(D)-physics
observables. The inner and outer contours enclose respective 68% and 95% joint regions. The
red (green) lines show the discovery reach of the LHC with 1 (100) fb$^{-1}$. The cross corresponds to the
best-fit point.

Figure 6: Non-normalized 1D marginalized posterior probability distribution of the $m$ parameter
for logarithmic prior and $\mu > 0$, including SM nuisance parameters constraints, EW breaking ($M_Z^{\rm exp}$),
EW observables, collider limits on Higgs and superpartner masses: + B(D)-physics observables (blue
solid line); + $a_\mu$ (red dashed line); + B(D)-physics and $a_\mu$ (green dashed-dotted line).

The previous statement is quite strong. History has taught us that many experimental
observables, in apparent disagreement with the SM prediction, have eventually converged
with it. This occurred due to both experimental and theoretical subtleties and difficulties
that sometimes had not been fully understood or taken into account. Although, obviously,
$a_\mu$ is a most relevant test for the SM, and hopefully will be a first indication of physics
beyond the SM, it is perhaps prudent not to take for granted that this is indeed so. As a
matter of fact, the experimental $e^+e^- \to$ hadrons cross section shows some inconsistencies
between different groups of experimental data (see [64] for a recent account). This is
especially noticeable if one considers the hadronic $\tau$-decay data, which are theoretically
related to the $e^+e^-$ hadronic cross section. Using the $\tau$ data, the $3.3\sigma$ disagreement
becomes $1.8\sigma$, i.e. one comes back to the SM realm. Although the more direct $e^+e^-$ data
are usually preferred to evaluate $a_\mu^{\rm SM}$, this discrepancy is warning us to be cautious about
this procedure.
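As a quick arithmetic check of the significance quoted for the $e^+e^-$-based evaluation, the pull is simply the central value of the discrepancy divided by its uncertainty. The snippet below is a minimal sketch: the function name `pull` is ours, and only the numbers 29.5 and 8.8 (in units of $10^{-10}$) come from the text.

```python
def pull(central: float, sigma: float) -> float:
    """Number of standard deviations separating `central` from zero."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return central / sigma

# Discrepancy (29.5 +/- 8.8) x 10^-10 quoted for the e+e- based evaluation.
n_sigma = pull(29.5, 8.8)
print(f"{n_sigma:.2f} sigma")
```

The result, about 3.35, is in line with the $3.3\sigma$ figure quoted in the text.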

To illustrate this situation, we have performed two alternative analyses. In the first
one we use the evaluation of $\delta^{\rm SM}_{\rm had} a_\mu$ based on $e^+e^-$ data. In the second, we use the one
based on $\tau$ data. We compute $\delta^{\rm MSSM} a_\mu$ at full one-loop level, adding the logarithmic piece of
the quantum electrodynamics two-loop calculation plus two-loop contributions from both
stop-Higgs and chargino-stop/sbottom [65]. The effective two-loop effect due to a shift in
the muon Yukawa coupling proportional to $\tan^2\beta$ has been added as well [66].

Using $\delta^{\rm SM}_{\rm had} a_\mu$ from $e^+e^-$ data

In this case, the inclusion of the $a_\mu$ constraint has a dramatic effect, as mentioned
above. The preferred values of the soft terms are pushed into the low-energy region.
Actually, the push is so strong that the predictions for other observables, in particular
$b \to s\gamma$, start to be too large. This tension has been pointed out in ref. [37], and we
would like to illustrate it here by presenting some representative plots. Fig. 6 shows the
(non-normalized) pdf for the $m$ parameter in three different cases (always taking a logarithmic
prior): a) using EW + Bounds + B-physics, as in subsect. 4.1 (blue solid line); b) using
EW + Bounds + $a_\mu$ (red dashed line); c) using EW + Bounds + B-physics + $a_\mu$ (green
dashed-dotted line).

Clearly, the effect of just $a_\mu$ is to bring the preferred region for the soft terms from
~ 1 TeV to ~ 300 GeV. This effect is remarkably stable against variations of the type of
prior, indicating that the data are now powerful enough to essentially select a region of the
parameter space. Let us also mention that large values of $\tan\beta$ now become much more
likely, being normally associated with the region of larger soft masses (recall that $\delta^{\rm MSSM} a_\mu$
grows with decreasing masses and increasing $\tan\beta$). When both $b \to s\gamma$ and $a_\mu$ are taken
into account, there is almost no region of the parameter space able to reproduce both
experimental results within $2\sigma$. Therefore the likelihood factor gets suppressed, and the
"preferred" region of the parameter space (illustrated here by the green line) is somehow
an average of the two previous cases. This tension between $b \to s\gamma$ and $a_\mu$ can also be
noticed by looking at Fig. 7, where the left and right panels show the pdfs of BR($B \to X_s\gamma$)
and $\delta^{\rm MSSM} a_\mu$ respectively, with the same code for the lines as in Fig. 6. Besides, we have
included a Gaussian in each panel (solid black line), proportional to the likelihood, and
thus centered at the experimental value with the experimental uncertainty. Comparing
the position of the bulk of the probability distribution with the likelihood, it is clear that
even the most favourable cases do not satisfactorily reproduce the two measurements
simultaneously, even though we have not attempted to quantify this tension in a rigorous
way.

Figure 7: Non-normalized 1D marginalized posterior probability distribution for BR($B \to X_s\gamma$)
(left panel) and for $\delta^{\rm MSSM} a_\mu$ (right panel). The code for the lines is as in Fig. 6. Besides, the black
(solid) Gaussians represent the experimental likelihood.

Let us also remark that, if the Higgs mass turns out to be $\mathcal{O}(10)$ GeV above the present
experimental limit, the tension between the Higgs mass and $a_\mu^{\rm exp}$ would be dramatic and
could not be reconciled: $m_h$ ($a_\mu^{\rm exp}$) would require too large (small) soft masses, see the
discussion in subsect. 4.1.

Fig. 8 shows the probability distribution in the $\{M, m\}$ and $\{\tan\beta, M\}$ planes, as in
Fig. 5, once the $a_\mu$ constraint (based on $e^+e^-$ data) is included. Comparison with Fig. 5
clearly shows the big push of the soft terms into the low-energy region. Actually, most of
the probability now falls within the LHC reach (even in the short term), which is great
news for the potential discovery of SUSY (if the discrepancy is really there).

Using $\delta^{\rm SM}_{\rm had} a_\mu$ from $\tau$ data

In this case, there is no big discrepancy between $a_\mu^{\rm SM}$ and $a_\mu^{\rm exp}$, so $\delta^{\rm MSSM} a_\mu$ does not
need to be large. Consequently, the probability distributions are essentially unchanged by
the inclusion of the $a_\mu$ constraint, and are very similar to those shown in subsect. 4.1 (Figs.
2-5).

Consequently, if $a_\mu$ is not a signal of new physics, the size of EW breaking continues to
be the only piece of data that brings SUSY to scales accessible to LHC (apart from Dark
Matter considerations, which we examine next).

4.3 Constraints from Dark Matter 

There are different astrophysical and cosmological observations that offer impressive
evidence of the existence of Dark Matter (DM) in the universe (see Table 2 for a recent
determination of $\Omega_{\rm DM}$). On the other hand, the consistency with the observed large-scale
structure of the universe favours cold dark matter (CDM), i.e. matter that is non-relativistic
at the beginning of galaxy formation. This leads to the hypothesis of a weakly interacting
massive particle (WIMP) as the component of CDM.

Supersymmetry offers a good candidate for such a WIMP, namely the LSP, which is
stable in the standard (R-parity conserving) SUSY formulations (for a review see [67]).
Although, depending on the models, there are several possibilities for the SUSY WIMP,
the most popular and natural candidate is the lightest neutralino, $\chi^0_1$, which is the LSP
in most of the CMSSM parameter space. However, the calculations show that typically
too many neutralinos are produced after inflation. Therefore some efficient annihilation
mechanism is required in order to bring $\Omega_{\rm DM}$ down to the allowed range. In the context of
the CMSSM there are four such mechanisms known, which take place in four different regions
of the parameter space:

Bulk region: Neutralinos can be annihilated (into leptons) via sleptons if the masses of
the latter are not high. This requires rather small $m$ and $M$ soft parameters, in potential
conflict with the Higgs mass bound.

Focus Point region: For moderate or large values of $\tan\beta$ the electroweak scale is quite
insensitive to the variation of $m$. For large enough values of $m$, the $\mu$ parameter decreases,
which gives the LSP a significant Higgsino component, making its annihilations
(into vector bosons) more efficient.

Co-annihilation region: If the mass of the second lightest supersymmetric particle (NLSP)
is close to that of the LSP, the annihilation of the latter is enhanced through co-annihilation
processes. In the CMSSM this mechanism typically takes place with a stau NLSP. In the
parameter space this corresponds to a rather narrow region with $M > m$.

Higgs funnel region: When the mass of the pseudoscalar $A^0$-boson becomes close to twice
the mass of the neutralino LSP and $\tan\beta$ is large, the annihilation occurs quite efficiently
through the resonance.
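The four mechanisms can be summarised as simple conditions on the spectrum. The sketch below is purely illustrative: the `Spectrum` fields, the function name and every numerical threshold are assumptions made for this example and are not taken from the analysis, which relies on a full relic-density calculation instead.

```python
from dataclasses import dataclass

@dataclass
class Spectrum:
    m_lsp: float          # lightest neutralino mass (GeV)
    m_nlsp: float         # next-to-lightest sparticle mass (GeV)
    m_slepton: float      # lightest slepton mass (GeV)
    m_A: float            # pseudoscalar Higgs mass (GeV)
    higgsino_frac: float  # Higgsino fraction of the LSP, in [0, 1]
    tan_beta: float

def annihilation_regions(s: Spectrum) -> list[str]:
    """Return the (possibly overlapping) annihilation mechanisms a point may use.

    All cuts are hypothetical placeholders chosen for illustration only.
    """
    regions = []
    if s.m_slepton < 200.0:                      # light sleptons
        regions.append("bulk")
    if s.higgsino_frac > 0.1:                    # sizeable Higgsino component
        regions.append("focus point")
    if (s.m_nlsp - s.m_lsp) / s.m_lsp < 0.1:     # NLSP nearly degenerate with LSP
        regions.append("co-annihilation")
    near_resonance = abs(s.m_A - 2.0 * s.m_lsp) / (2.0 * s.m_lsp) < 0.1
    if near_resonance and s.tan_beta > 40.0:     # m_A close to twice the LSP mass
        regions.append("Higgs funnel")
    return regions

point = Spectrum(m_lsp=400.0, m_nlsp=900.0, m_slepton=950.0,
                 m_A=820.0, higgsino_frac=0.01, tan_beta=50.0)
print(annihilation_regions(point))
```

A point may satisfy several conditions at once, which is why the function returns a list rather than a single label.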

Figure 9: As in Fig. 5 but with an additional constraint from the WMAP CDM abundance.

In order to evaluate the viability of supersymmetric CDM at each point of the CMSSM
parameter space, we use the MicrOMEGAs code [68] integrated into SuperBayes. The
corresponding likelihood, assuming that all the CDM is made up of neutralinos, is then
incorporated into the pdf in the Bayesian scan. Fig. 9 shows the resulting probability distribution in
the $\{M, m\}$ and $\{M, \tan\beta\}$ planes (i.e. when all the parameters but two are marginalized)
for logarithmic and flat priors. In these figures we have not included the information about
$a_\mu$. The $\{M, m\}$-plane plots show a kind of blurring with respect to usual plots in the
literature, due to the integration over the variables $A$, $\tan\beta$. Still, the above-mentioned four
viable regions are visible in Fig. 9. On the other hand, the $\{M, \tan\beta\}$ plots show two big
preferred regions. The largest one occurs at $M < 1$ TeV and contains (mixed) the
Co-annihilation, Bulk and part of the Focus Point regions. The second one occurs at $M > 1$
TeV and corresponds to the part of the Focus Point region that needs moderate to large
values of $\tan\beta$. Besides, the very small island around $M = 200$ GeV (also visible in the
$\{M, m\}$ plots) also corresponds to the Focus Point region. Finally, the Higgs funnel region,
which becomes significant for very large values of $\tan\beta$, is located around $\tan\beta = 50$. Let
us remark that, since some of the previous regions require large $\tan\beta$, the latter becomes
more probable than before including CDM constraints.
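Marginalizing "all the parameters but two" amounts to summing the posterior over the hidden directions. The following is a minimal sketch on a toy three-parameter grid; the function name `marginalize_last_axis` and the toy numbers are assumptions for illustration and are unrelated to the actual scan.

```python
def marginalize_last_axis(post):
    """Sum a posterior tabulated on a regular 3D grid over its last axis.

    `post[i][j][k]` is the (unnormalized) posterior at grid point (i, j, k).
    The result is a normalized 2D table over the first two parameters.
    """
    table = [[sum(cell) for cell in row] for row in post]
    total = sum(sum(row) for row in table)
    return [[value / total for value in row] for row in table]

# Toy posterior on a 2 x 2 x 3 grid (arbitrary numbers, for illustration only).
toy = [
    [[1.0, 2.0, 1.0], [0.0, 1.0, 0.0]],
    [[0.5, 0.5, 0.0], [2.0, 1.0, 1.0]],
]
marginal = marginalize_last_axis(toy)
for row in marginal:
    print([round(v, 3) for v in row])
```

Because the hidden direction is summed rather than maximized, a cell with many moderately likely hidden values can outweigh one with a single excellent fit, which is the origin of the blurring mentioned above.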

Although the favoured regions are qualitatively similar for logarithmic and flat priors,
quantitatively the area of highest probability is extended into larger (even inaccessible to
LHC) soft masses in the case of flat prior. This is because the DM constraints, though
quite severe, do not select a unique region of the parameter space but several ones, located
in different zones of the CMSSM parameter space, as discussed above. Consequently, the
prior assumed for the parameter space plays a relevant role when comparing the relative
probability of these regions.
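To see why the prior matters when several separated regions survive, one can compare the prior mass that a flat and a logarithmic prior assign to a low-mass and a high-mass interval. The sketch below is illustrative only: the scan range (10 GeV to 100 TeV) and the two intervals are assumed values, not those used in the analysis.

```python
import math

def flat_mass(lo, hi, xmin, xmax):
    """Prior probability of [lo, hi] for a prior flat in x on [xmin, xmax]."""
    return (hi - lo) / (xmax - xmin)

def log_mass(lo, hi, xmin, xmax):
    """Prior probability of [lo, hi] for a prior flat in log(x) on [xmin, xmax]."""
    return math.log(hi / lo) / math.log(xmax / xmin)

XMIN, XMAX = 10.0, 1.0e5      # assumed scan range in GeV
low = (100.0, 1000.0)         # assumed "low-energy" interval
high = (1000.0, 10000.0)      # assumed "high-energy" interval

ratios = {}
for name, mass in (("flat", flat_mass), ("log", log_mass)):
    ratios[name] = mass(*high, XMIN, XMAX) / mass(*low, XMIN, XMAX)
    print(f"{name} prior: high/low prior-mass ratio = {ratios[name]:.1f}")
```

With these assumed numbers the flat prior weights the higher decade ten times more than the lower one, while the logarithmic prior weights them equally, consistent with the flat-prior posterior extending further into large soft masses.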

Regarding the impact on the LHC discovery potential, roughly speaking, including
DM constraints favours the low-energy region and therefore the detection of SUSY at the
LHC, as can be seen by comparing Figs. 5 and 9. However, large high-energy areas out of
the LHC reach survive (though they are less probable, especially for log prior). Consequently,
again, if we are unlucky, even if DM is supersymmetric, it could escape LHC detection
(especially if the Higgs mass is not close to its experimental limit).

In any case, again, one should be cautious about interpreting these results as a robust
constraint on the CMSSM. Certainly, they are so with a "standard" cosmology. However,
it could happen that other regions of the MSSM parameter space are cosmologically viable
if, e.g., the overproduction of CDM is diluted by electroweak baryogenesis. Admittedly,
the latter is not a most natural or popular scenario of inflation, but mechanisms for it have
been explored [70]. Alternatively, the LSP could be unstable assuming tiny violations of
R-parity, see e.g. [71]. In these cases the observed dark matter should be provided by
another candidate, e.g. an axion. But this is not a drawback for the model. Of course, CDM
constraints are extremely interesting and they have to be taken into account. But it seems
sensible not to put them at the same level as e.g. electroweak observables.

Finally, Fig. 10 shows the $\{M, m\}$-plane plots when the $a_\mu$ constraint (based on $e^+e^-$
data) is incorporated into the analysis as well. Clearly, the regions with "too large" soft
masses (to reproduce $a_\mu^{\rm exp}$) are now suppressed, leaving a quite definite region at
low energy. More precisely, the bulk and co-annihilation regions are now clearly selected
amongst the various possibilities to obtain $\Omega_{\rm DM}$. We stress, however, that in this case
one should be cautious about both the $\Omega_{\rm DM}$ and the $a_\mu$ constraints. Note in particular
that, if the $a_\mu$ constraint is based on $\tau$ data, it does not produce relevant restrictions and,
consequently, the corresponding plots are quite similar to those of Fig. 9.

5. Negative sign of $\mu$

So far all the results and plots presented correspond to $\mu > 0$. The analysis for $\mu < 0$ is
completely similar. The most noteworthy difference is that with $\mu < 0$ the MSSM
contributions to $a_\mu$ have negative sign and thus become useless to reconcile theory and
experiment (a discrepancy that is only present if $\delta^{\rm SM}_{\rm had} a_\mu$ is evaluated using $e^+e^- \to$ had
data). On the other hand, the contributions to $b \to s\gamma$ now have positive sign, which is
the "right" sign to push the theoretical result closer to the experimental value (see Table
1). This effect, however, has less impact than that of $a_\mu$ on the distribution of probability.

5.1 Results 

The results for $\mu < 0$ are summarized in Figs. 11, 12, 13 and 14, which correspond to the
previous Figs. 5, 8, 9 and 10, but with opposite sign of $\mu$.

Fig. 11 shows the posterior distribution function when only the most robust set of data
(EW and B(D)-physics observables, and limits on particle masses) is taken into account.
Because of the above-mentioned $b \to s\gamma$ observable, the distribution is now slightly shifted
to smaller soft masses (now "it pays" to have a moderately sizeable SUSY contribution
to this process), as is clear from comparison with Fig. 5. The effect is welcome, as it
pushes SUSY towards regions of the parameter space more accessible to LHC. However, the
impact is far from dramatic.

Figure 10: As in Fig. 5 but with an additional constraint from the WMAP CDM abundance.

Fig. 12 shows the posterior when $a_\mu$ (evaluated using $e^+e^-$ data) is included in the
analysis. Now the difference with the analogue for positive $\mu$ (Fig. 8) is really dramatic.
Recall that now the SUSY contributions to $a_\mu$ have the wrong sign, so it does not pay to
have smaller soft masses. Consequently, Fig. 12 is similar to Fig. 11 (i.e. before including
$a_\mu$ constraints) and, actually, the soft masses are pushed to slightly higher values.

Fig. 13 shows the posterior when one considers the previous robust set of data (not
including $a_\mu$) plus the constraints from Dark Matter. Since Dark Matter has a great
potential to select preferred regions in the parameter space, the results are quite similar to
those for $\mu > 0$, Fig. 9; and the same comments hold here.

Figure 11: As in Fig. 5 but with $\mu < 0$.

Finally, Fig. 14 shows the posterior when all the experimental information, including
$a_\mu$ (evaluated using $e^+e^-$ data), is taken into account. Similarly to our above discussion
of Fig. 12, the results do not change much after the inclusion of the $a_\mu$ constraint. In
consequence, Fig. 14 is quite similar to Fig. 13, with a certain penalization of too small
soft masses. Again, this is in strong contrast with the $\mu > 0$ case, where the low-energy
(bulk and co-annihilation) regions were preferred (see Fig. 10).

5.2 Positive versus negative $\mu$

Figure 12: As in Fig. 8 but with $\mu < 0$.

In order to compare the relative probability of the $\mu > 0$ and $\mu < 0$ branches, one has
to evaluate the Bayesian evidences of both cases. We recall that the Bayesian evidence
is the piece in the denominator of eq. (1.1), i.e. $p({\rm data})$, sometimes called $Z$. Note that,
integrating both sides of eq. (1.1), and using the fact that $p(\theta_i|{\rm data})$ must be correctly
normalized, one simply obtains

$$ Z = p({\rm data}) = \int d\theta_1 \cdots d\theta_N \; p({\rm data}|\theta_i)\, p(\theta_i) , \qquad (5.1) $$

i.e. the evidence is the integral of the likelihood times the prior, and therefore it is a
measure of the global probability of the model. Note that in doing parameter inference
within a given model, $Z$ plays the role of a normalization factor and can be (and usually is)
ignored. However, it is the key quantity to compare the relative probability of
two different models. Let $\mathcal{M}_1$, $\mathcal{M}_2$ be two models with prior probabilities $p(\mathcal{M}_1)$, $p(\mathcal{M}_2)$.



[Figure 13 panels: log priors and flat priors, CMSSM, $\mu < 0$; horizontal axes $\log_{10}[M\,({\rm GeV})]$; plot label: Cabrera, Casas & Ruiz de Austri (2009).]

Figure 13: As in Fig. 9 but with $\mu < 0$.



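Then the relative posterior probability of the two models, given a set of data, is simply

$$ \frac{p(\mathcal{M}_1|{\rm data})}{p(\mathcal{M}_2|{\rm data})} = \frac{Z_1\, p(\mathcal{M}_1)}{Z_2\, p(\mathcal{M}_2)} = B_{12}\, \frac{p(\mathcal{M}_1)}{p(\mathcal{M}_2)} , $$

where $B_{12} = Z_1/Z_2$ is called the Bayes factor and $p(\mathcal{M}_1)/p(\mathcal{M}_2)$ is the prior factor, often
set to unity.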
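As a toy illustration of eq. (5.1) and of the Bayes factor, the sketch below integrates a one-dimensional Gaussian likelihood against two flat priors of different width. Everything in it (the likelihood, the prior ranges, the function names) is an assumption made for the example; it is not the evidence computation used in the paper.

```python
import math

def gaussian_likelihood(theta, mean=0.0, sigma=1.0):
    """Normalized Gaussian likelihood p(data | theta) for a single parameter."""
    z = (theta - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def evidence_flat_prior(lo, hi, n=20000):
    """Z = integral of likelihood times prior, for a prior flat on [lo, hi].

    Uses the trapezoidal rule with n equal steps.
    """
    prior = 1.0 / (hi - lo)
    step = (hi - lo) / n
    total = 0.5 * (gaussian_likelihood(lo) + gaussian_likelihood(hi))
    total += sum(gaussian_likelihood(lo + i * step) for i in range(1, n))
    return total * step * prior

z1 = evidence_flat_prior(-5.0, 5.0)     # model 1: narrow prior range
z2 = evidence_flat_prior(-50.0, 50.0)   # model 2: ten times wider prior range
bayes_factor = z1 / z2
print(f"Z1 = {z1:.4f}, Z2 = {z2:.4f}, ln B12 = {math.log(bayes_factor):.2f}")
```

Both toy models fit the data equally well at their best point, yet the wider prior dilutes the evidence by about a factor of ten ($\ln B_{12} \simeq \ln 10 \simeq 2.3$), illustrating how the evidence penalizes prior volume that the data do not support.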

The natural logarithm of the Bayes factor provides a useful indication of the different
performance of two models. In Table 4 we summarize the translation of the Bayes factor
to relative probabilities and a conventional interpretation of them, which we follow in
this paper.
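For equal prior probabilities of the two models, this translation is elementary. The worked relation below is our own summary, not a formula quoted from the paper, of how the odds and probability columns of Table 4 follow from $\ln B$:

```latex
\begin{align*}
\text{odds} &= B = e^{\ln B}, & p &= \frac{B}{1+B}, \\
\ln B = 1.0 &:\; B \simeq 2.7 \;(\sim 3:1), & p &\simeq 0.73, \\
\ln B = 2.5 &:\; B \simeq 12.2 \;(\sim 12:1), & p &\simeq 0.92, \\
\ln B = 5.0 &:\; B \simeq 148 \;(\sim 150:1), & p &\simeq 0.993.
\end{align*}
```

The tabulated probabilities 0.750, 0.923 and 0.993 correspond to the rounded odds 3:1, 12:1 and 150:1, i.e. 3/4, 12/13 and 150/151.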

The evaluation of the Bayesian evidence is in general a numerically challenging task,
as it involves a multidimensional integral over the whole parameter space. In addition,
the likelihood is often multi-modal, or it has strong degeneracies that confine the
posterior to thin sheets in parameter space. Standard techniques such as thermodynamic integration
[73] have been proposed for a reliable estimation of the evidence, but they are extremely
computationally expensive. Perhaps the most elegant algorithm proposed up to now is
nested sampling, as referred to in Sec. 3. This technique has greatly reduced the
computational cost of model selection.

  $|\ln B_{10}|$   Odds        Probability   Strength of evidence
  < 1.0            < 3 : 1     < 0.750       Inconclusive
  1.0              ~ 3 : 1     0.750         Weak Evidence
  2.5              ~ 12 : 1    0.923         Moderate Evidence
  5.0              ~ 150 : 1   0.993         Strong Evidence

Table 4: The scale we use for the interpretation of model probabilities.

  Observables                                                   $\ln B_{+-}$ (flat)   $\ln B_{+-}$ (log)
  EW + Bounds + B-physics                                       -0.32 ± 0.048         -0.48 ± 0.049
  EW + Bounds + B-physics + $(g-2)_\mu$                         0.81 ± 0.043          1.73 ± 0.052
  EW + Bounds + B-physics + $\Omega_{\rm DM}$                   -0.31 ± 0.068         -0.66 ± 0.066
  EW + Bounds + B-physics + $(g-2)_\mu$ + $\Omega_{\rm DM}$     1.9 ± 0.065           3.71 ± 0.068

Table 5: The natural log of the Bayes factor ($\ln B_{+-}$) for $\mu > 0$ and $\mu < 0$. A positive (negative)
value indicates a preference for $\mu > 0$ ($\mu < 0$).

Let us now come back to our task, namely comparing the evidences for the positive
and negative $\mu$ branches, considered here as different models with equal prior probabilities.
We have applied the MultiNest algorithm to obtain the Bayes factor of these two models,
$B_{+-} = Z_+/Z_-$. Of course the results depend on the experimental information considered,
which enters the likelihood piece in eq. (5.1). The results are given in Table 5.

The first column of Table 5 indicates the set of experimental data taken into account;
the notation is self-explanatory and corresponds to the different cases previously defined.
The discussion of subsect. 5.1 allows one to understand the numbers of the table. When only
the most robust pieces of experimental information are used (first row), the performance
of both models is similar. The $\mu < 0$ branch is slightly favoured, due to its capability to
reproduce the central value of $b \to s\gamma$, but the effect is not really significant, as is shown
by a value of $|\ln B_{+-}|$ well below 0.75, see Table 4. This holds when $\Omega_{\rm DM}$ constraints
are incorporated into the analysis (third row of Table 5). On the other hand, $(g-2)_\mu$
constraints (when evaluated using $e^+e^- \to$ had data) clearly favour the $\mu > 0$ branch, as
discussed above, which is reflected in the numbers of the second and fourth rows of Table 5.
Using the conventions of Table 4, we see that the global evidence in favour of positive $\mu$
is weak-to-moderate (not strong but already significant). Note that this effect is stronger
for log prior, since in that case the high-energy region (the preferred one for $\mu < 0$) gets
an additional penalization. Likewise, when $\Omega_{\rm DM}$ constraints are included (at the same
time as $(g-2)_\mu$), the preference for positive $\mu$ gets even stronger. This is because $\Omega_{\rm DM}$
constraints favour (in terms of statistical weight) the low-energy region of the parameter
space, and this is the region strongly preferred (penalized) by $a_\mu$ constraints for positive
(negative) $\mu$.
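The reading of Table 5 through the scale of Table 4 can be reproduced with a few lines. This is only a sketch: the function names are ours, equal prior probabilities for the two branches are assumed (as in the text), and treating the $|\ln B|$ values 1.0, 2.5 and 5.0 of Table 4 as lower bin edges is our assumption.

```python
import math

def probability_from_lnB(ln_b):
    """Posterior probability of the favoured model for equal model priors."""
    odds = math.exp(abs(ln_b))
    return odds / (1.0 + odds)

def strength(ln_b):
    """Label |ln B| using 1.0, 2.5 and 5.0 (Table 4) as lower bin edges."""
    x = abs(ln_b)
    if x < 1.0:
        return "Inconclusive"
    if x < 2.5:
        return "Weak Evidence"
    if x < 5.0:
        return "Moderate Evidence"
    return "Strong Evidence"

# Central values of ln B_{+-} from Table 5 (log prior column).
table5_log = {
    "EW + Bounds + B-physics": -0.48,
    "EW + Bounds + B-physics + (g-2)": 1.73,
    "EW + Bounds + B-physics + Omega_DM": -0.66,
    "EW + Bounds + B-physics + (g-2) + Omega_DM": 3.71,
}
for label, ln_b in table5_log.items():
    favoured = "mu > 0" if ln_b > 0 else "mu < 0"
    print(f"{label}: {strength(ln_b)} for {favoured}, "
          f"p = {probability_from_lnB(ln_b):.3f}")
```

The uncertainties quoted in Table 5 are ignored here; propagating them would only shift the probabilities slightly and would not move any row across a bin edge.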



6. Comparison to previous work 

Some of the previous work in this subject has been collected in refs. |Q, ||, ||], [|, [|, ||, 0, §| 
p|, [l0| All of them are Bayesian analyses, except refs.[||, 1C]. However, a fair comparison 
with our work is tricky, since these articles often make assumptions very different from
ours about the priors and ranges of the initial parameters (and even about which are the
initial parameters). Also they may include different pieces of experimental information.
The last point is dramatic regarding $(g-2)_\mu$, as is clear from subsect. 4.2. Nevertheless
it is interesting to compare our work with this previous literature, to make clearer how
all these differences affect the results and conclusions. For the sake of concreteness, we
have considered five previous representative works, corresponding to refs.[[j], ^, ||, ||, [H|. 

In ref. [1], which was pioneering in MSSM Bayesian analyses, $\tan\beta$ was considered as
an initial parameter, with flat prior. As a result, there is no penalization of the large $\tan\beta$
region, which thus becomes even favoured by experimental data (probably because of Dark
Matter constraints, see below). Besides, the authors always include the experimental data
concerning $(g-2)_\mu$ (based on $e^+e^-$ data) and Dark Matter constraints. Finally, the priors
for the soft terms are taken as flat, with ranges bounded by 2 TeV. Hence their Fig. 2 would
correspond to our Fig. 10 (they are based on essentially the same experimental data).
Actually, the $\{M, m\}$ plots of the two figures are not very different, although ours favour
more clearly the low-energy region (due to the incorporation of the electroweak scale, as
discussed in sect. 2). This effect would have been more impressive if in ref. [1] they had
unplugged the $(g-2)_\mu$ and Dark Matter data, and much more so if, besides, they had
widened the allowed range of the soft terms. On the other hand, their $\{M, \tan\beta\}$ plots
favour more clearly the region of very large $\tan\beta$ (Higgs funnel region). In our opinion this
effect is not realistic, since $\tan\beta$ is clearly a derived parameter, and this fact introduces a
Jacobian factor in the associated probability distribution, penalizing large $\tan\beta$.

In ref.Q the initial assumptions were similar to those of ref. [1] (and the results are
consistent with each other). Therefore the comparison with our results is also similar. In
this case, however, the authors tried two different classes of ranges for the soft parameters
(up to 2 TeV and 4 TeV) and they also tried disconnecting $(g-2)_\mu$. From their Fig.
16, it is clear that, by unplugging $(g-2)_\mu$, the preferred region for $M$ goes from 0.5-1 TeV
to 1-1.5 TeV. Comparing to our Fig. 9 (which is now the corresponding one), we see that
in our analysis the high-energy region is more penalized, which is not surprising.

In ref.Q], a refined version of the analysis of ref. Q was presented. In this case, $\tan\beta$
was considered a derived parameter (which introduces a Jacobian factor). Also, $M_Z$ was
marginalized, as in our case (for a detailed comparison between the two procedures see
ref. |^8|). Therefore, the initial set-up of that reference is the most similar one to ours. Their
priors, however, are quite different and somewhat arbitrary (though reasonable). They
would correspond more or less to our logarithmic priors, allowing very large ranges for the
parameters. In their results the authors indeed observed a penalization of the high-energy
region, which they attributed to the choice of the priors. We think, however, that it is
mainly a consequence of the marginalization of $M_Z$, and the effective penalization of
fine-tuning that it entails (something that is far from obvious at first sight). In their Fig. 3 they
compare their results with those of ref. [1]. There one can clearly see the extra penalization
of the high-energy region. The $\{M, m\}$ and $\{M, \tan\beta\}$ plots of that figure correspond to
the (log prior) plots of our Fig. 10. Indeed, both figures are quite consistent (theirs are
even more tilted towards low energy, probably due to the additional effect of their choice
of priors). Unfortunately, they do not explore unplugging $(g-2)_\mu$ and Dark Matter data,
so a comparison with other results and plots of our paper is not possible.

In ref.||] a Bayesian analysis of the so-called pMSSM ("phenomenological MSSM")
was presented. This model has many initial parameters (~ 20), all of them defined at
low energy. Apart from that, the set-up of the analysis was similar to that of ref. [1]. In
particular they took $\tan\beta$ as an initial parameter, and considered flat priors and finite
ranges for the soft parameters (< 4 TeV), including $(g-2)_\mu$ and Dark Matter experimental
data in all instances. In order to make any comparison with our work, one has to focus
on particular quantities. A good example is the gluino mass, $M_{\tilde g}$, which for mSUGRA is
~ 2.5$M$. From their Fig. 3 the peak of the probability distribution of $M_{\tilde g}$ is around 2-3 TeV,
which would correspond to $M$ ~ 1 TeV. This should be compared with our Fig. 10 (the
one based on a similar set of experimental data). In our case the peak of the distribution
is around 400 GeV, showing that we get an extra penalization of the high-energy region, as
explained in this paper. Unplugging $(g-2)_\mu$ and Dark Matter, the differences would have
been more dramatic, especially if the allowed ranges of the parameters were stretched.



Finally, in ref. [10] a frequentist analysis of the MSSM was presented. This is a point
of view complementary to the Bayesian approach followed here. The authors of ref. [10]
perform a scan of the parameter space of the CMSSM (and also of the so-called NUHM1
model), evaluating the likelihood (based on the $\chi^2$). This leads to zones of estimated
probability (inside contours of constant $\chi^2$) around the best-fit points in the parameter
space. Their Fig. 1 ($\{M, m\}$ plane) corresponds to our Fig. 10. However, notice that, in
their case, the unplotted variables are optimized to obtain the best $\chi^2$, whereas in our case
they are marginalized. Nevertheless, it is remarkable that the two figures are quite similar
(especially comparing with our log-prior plot). This is an encouraging result. Indeed, the
frequentist and Bayesian approaches must converge when the quality of data increases.
This means that the bulk of the probability is centered around the best-fit points. This
coincidence is also observed when the authors try unplugging $(g-2)_\mu$ (compare their Fig.
2 to our Fig. 9). Since they do not explore unplugging Dark Matter, it is not possible to
make further comparisons. It is likely that in that case their 68% and 95% c.l. regions
would become much more extended in the parameter space, thus taking up a large portion of
the high-energy (non-accessible to LHC) region. Notice that a frequentist approach cannot
penalize those regions from fine-tuning arguments. Fine-tuning has to do with statistical
weight (see subsect. 2.1) and a frequentist analysis is based on likelihood, i.e. the ability
to reproduce the experiment. Without $(g-2)_\mu$ and Dark Matter data, the experimental
reasons to stick to low energy are much less powerful. In other words, without $(g-2)_\mu$ and
Dark Matter it is likely that the convergence between frequentist and Bayesian approaches
is still weak.
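The difference between optimizing and marginalizing the unplotted variables can be made concrete with a toy likelihood. In the sketch below, every number and name is an assumption chosen for illustration: a "low" region fits reasonably well for any value of a hidden variable, while a "high" region fits perfectly but only in a narrow, finely tuned window.

```python
import math

def likelihood(region, y):
    """Toy likelihood: a broad 'low' region and a finely tuned 'high' region."""
    if region == "low":
        return 0.8                                     # decent fit for any y
    return math.exp(-0.5 * ((y - 0.5) / 0.01) ** 2)    # perfect fit only near y = 0.5

ys = [i / 1000 for i in range(1001)]   # hidden variable y, flat prior on [0, 1]

# Frequentist-style profiling: keep the best value of y in each region.
profile = {r: max(likelihood(r, y) for y in ys) for r in ("low", "high")}
# Bayesian marginalization: average the likelihood over the prior in y.
marginal = {r: sum(likelihood(r, y) for y in ys) / len(ys) for r in ("low", "high")}

print("profiled    :", {r: round(v, 3) for r, v in profile.items()})
print("marginalized:", {r: round(v, 3) for r, v in marginal.items()})
```

Profiling ranks the finely tuned region first because only the best fit counts, whereas marginalization ranks it last because its statistical weight is small. With more constraining data both procedures would single out the same region.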



7. Summary and conclusions 

The idea of an MSSM forecast for the LHC is to use all the present (theoretical and
experimental) information available to determine the relative probability of the different regions
of the MSSM parameter space. This includes theoretical constraints (and perhaps
prejudices) and experimental constraints. An appropriate framework to perform such a forecast is
the Bayesian approach, which allows a sensible statistical analysis, clearly identifying the
objective and subjective pieces of information. The latter are incorporated in the prior,
which represents the "theoretical" probability density that is assigned a priori to each point
in the parameter space. Ignoring the prior factor is not necessarily the most reasonable
or "free of prejudices" attitude. Such a procedure is equivalent to a completely flat prior in
the parameters. But one needs some theoretical basis to establish, at least, the parameters
whose prior can be reasonably taken as flat. Besides, a choice for the allowed ranges of
the various parameters is necessary in order to make statistical statements. The Bayesian
approach allows one to keep track of the influence of the prior (whether explicit or implicit)
upon the results.

In this paper we have performed a Bayesian analysis of the MSSM with universal soft
terms at high energy (a scenario sometimes denoted CMSSM or mSUGRA). We have
improved previous studies by means of a careful handling of the various pieces of information:

• First, we do not incorporate ad hoc measures of the fine-tuning to penalize regions
of the parameter space with large soft terms. Such penalization arises from the
Bayesian analysis itself when the experimental value of $M_Z$ is considered on the same
footing as the rest of the experimental information (and not as a constraint of the model).
Nicely, this permits scanning the whole parameter space, allowing arbitrarily large soft
terms. Still, the low-energy region is statistically favoured (even before including dark
matter or $g-2$ constraints). Incidentally, this statistical argument supports low-energy
supersymmetry breaking (in the observable sector), even in a landscape scenario. On
the other hand, in a frequentist analysis (thus ignoring the prior factor), the
high-energy region is essentially not disfavoured at all (before including dark matter or
$g-2$ constraints), since it works as well as the ordinary SM. Using fine-tuning arguments
to penalize it would take us back to the choice of an implicit (and non-trivial) prior.

• We have carried out a rigorous treatment of the nuisance variables, in particular Yukawa
couplings, showing that the usual practice of taking the Yukawas as required to
reproduce the fermion masses approximately corresponds to taking logarithmically
flat priors in the Yukawa couplings. We argue that this is a most reasonable choice.



• Although we start with the usual MSSM initial parameters, $\{m, M, A, B, \mu\}$ (plus 
Yukawa couplings and other nuisance variables), we use an efficient (and actually 
quite common) set of variables to scan the MSSM parameter space. Besides trading 
$\mu$ for $M_Z$ and the Yukawa couplings for the fermion masses, it is extremely convenient 
to trade $B$ for $\tan\beta$. These changes introduce a global Jacobian factor in the probability 
density when working in the new (and more suitable) parameters for the scan. 
Once the information about $M_Z^{\rm exp}$ is incorporated (by marginalizing $M_Z$), the effective 
prior in the new variables inherits the Jacobian factor, as is explicit in eq. (2.8). A 
quite accurate analytical expression for it is given in eq. (2.15), which is valid for 
any MSSM (not just the CMSSM). This effective prior contains the above-mentioned 
penalization of fine-tuned regions, but we stress that the latter has not 
been introduced by hand. Actually, these expressions for the effective prior contain 
no ad hoc constraints or prejudices, since the prior in the initial variables is still 
undefined. 

• We have developed a sensible prior in the initial variables. Our basic assumption 
has been that the soft-breaking terms share a common origin and, hence, it is logical 
to assume that their sizes are also similar. So, it is reasonable to assume that a 
particular soft term can take any value (with essentially flat probability) of the order 
of the typical size of the soft terms in the observable sector, $M_S \sim F/\Lambda$, or below 
it. Then, concerning the prior in $M_S$ itself, we have taken the two basic choices: a 
flat prior and a logarithmic prior (i.e. flat in the order of magnitude of $M_S$). We perform 
the analysis for the two priors, even though we think the logarithmic one is more 
realistic, taking into account what we know about mechanisms of SUSY breaking. 
This allows us to quantify the dependence of the results on the choice of the prior. 
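
The interplay between the Jacobian factor and the choice of prior can be illustrated with a deliberately crude toy model. The sketch below is not the expression of eq. (2.8) or eq. (2.15): it assumes a schematic relation $M_Z^2 = 2(k\,m^2 - \mu^2)$ with a hypothetical coefficient k, a single soft mass m standing in for $M_S$, and hypothetical scan limits. All function names are invented for this illustration. It only shows the direction of the effect: trading $\mu$ for $M_Z$ brings in a factor $|d\mu/dM_Z|$ that falls with the size of the soft terms, so the low-energy region gains weight for both flat and logarithmic priors.

```python
import math

MZ = 91.19  # GeV, experimental Z mass


def mu_from_mz(m, k=1.0, mz=MZ):
    """Toy inversion of the ASSUMED relation M_Z^2 = 2 (k m^2 - mu^2), mu > 0."""
    mu_sq = k * m * m - 0.5 * mz * mz
    if mu_sq <= 0:
        raise ValueError("toy relation has no solution for such a small m")
    return math.sqrt(mu_sq)


def jacobian(m, k=1.0, mz=MZ):
    """|d mu / d M_Z| at fixed m: the factor picked up when trading mu for M_Z."""
    return mz / (2.0 * mu_from_mz(m, k, mz))


def effective_density(m, prior, with_jacobian=True):
    """Unnormalized density in m: prior on m times (optionally) the Jacobian."""
    base = 1.0 if prior == "flat" else 1.0 / m  # "log" means flat in ln(m)
    return base * (jacobian(m) if with_jacobian else 1.0)


def fraction_below(m_cut, m_min, m_max, prior, with_jacobian=True, n=20000):
    """Fraction of the probability mass with m < m_cut (midpoint rule in ln m)."""
    def integral(lo, hi):
        step = (math.log(hi) - math.log(lo)) / n
        total = 0.0
        for i in range(n):
            m = math.exp(math.log(lo) + (i + 0.5) * step)
            total += effective_density(m, prior, with_jacobian) * m * step
        return total
    return integral(m_min, m_cut) / integral(m_min, m_max)


if __name__ == "__main__":
    # Hypothetical scan limits in GeV; m_cut is a stand-in for "low energy".
    for prior in ("flat", "log"):
        for m_max in (1.0e4, 1.0e6):
            bare = fraction_below(2000.0, 100.0, m_max, prior, with_jacobian=False)
            eff = fraction_below(2000.0, 100.0, m_max, prior, with_jacobian=True)
            print(f"{prior:4s} m_max={m_max:.0e}  bare={bare:.3f}  effective={eff:.3f}")
```

In this toy, the fraction of probability below the cut collapses for a bare flat prior when the upper limit is raised, whereas with the Jacobian included it falls much more slowly, and for the logarithmic prior it barely moves. The real analysis involves several soft terms and the full expressions of section 2, so the numbers produced here carry no physical meaning.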



The second part of the paper (sections 3-5) is devoted to incorporating all the important 
experimental constraints into the analysis, obtaining a map of the probability density of the 
MSSM parameter space, i.e. the MSSM forecast for the LHC. Since not all the experimental 
information is equally robust, we perform separate analyses depending on the group of 
observables used. The main results are the following: 



• First we include only the most robust experimental data: EW and B(D)-physics 
observables, and lower bounds on the masses of supersymmetric particles and the 
Higgs mass. Then, the favoured region of the MSSM parameter space lies at low 
energy, but there is a significant portion out of the LHC reach. This effect is more 
prominent in the case of the flat prior, but it is visible for both flat and logarithmic priors. 
The main reason for this situation is the lower bound on the Higgs mass: to become 
consistent with experiment, the Higgs mass needs sizeable radiative corrections, 
which require larger soft terms. We show that increasing the Higgs mass by a few 
GeV substantially reduces the amount of parameter space within the LHC reach. In 
consequence, if we wish to detect SUSY at the LHC, let us hope that the Higgs mass is 
close to the present experimental limit. We also show the present preferred credibility 
interval for the Higgs mass (Table 3). 

• Then we add the information about $a_\mu$. As is well known, the impact of this observable 
depends dramatically on the way one computes the SM hadronic contribution. 
Using $e^+e^- \to$ had data (the most common choice in the literature), the soft terms 
are dramatically pushed into the low-energy region (for $\mu > 0$), well inside the LHC 
reach. Actually, the push is so strong that the predictions for other observables, in 
particular $b \to s\gamma$, start to be too large. Furthermore, if the Higgs mass turns out to 
be O(10) GeV above the present experimental limit, the tension between the Higgs 
mass and $a_\mu$ would be dramatic and could not be reconciled: the Higgs mass ($a_\mu$) would 
require too large (small) soft masses. 






Using $\tau$-decay data instead of $e^+e^- \to$ had, there is no big discrepancy between 
$a_\mu^{\rm SM}$ and $a_\mu^{\rm exp}$, so the SUSY contribution does not need to be large. Consequently, the 
probability distributions are essentially unchanged by the inclusion of the $a_\mu$ constraint. 
Although the more direct $e^+e^-$ data are usually preferred to evaluate $a_\mu^{\rm SM}$, 
the disagreement between the two determinations warns us to be cautious about this procedure. 

• We then consider the impact of Dark Matter (DM) constraints (unplugging the $a_\mu$ 
constraints), namely we require that the WIMP responsible for $\Omega_{\rm DM}$ is the 
supersymmetric LSP (typically the lightest neutralino). As is known, this selects four regions 
in the parameter space: bulk, focus point, co-annihilation and Higgs funnel. When 
all the SUSY parameters except the universal scalar and gaugino masses ($m$ and $M$ 
respectively) are marginalized, these regions appear clearly in the $m$-$M$ plots. One 
can observe a certain blurring with respect to the usual plots in the literature, due to the 
integration over the other variables. 

Roughly speaking, when DM constraints are included the low-energy region becomes favoured, 
and with it the detection of SUSY at the LHC. However, large high-energy 
areas out of the LHC reach survive. Consequently, again, if we are unlucky, even if DM 
is supersymmetric, it could escape LHC detection (especially if the Higgs mass is 
not close to its present experimental limit). In any case, we have stressed that one 
should be cautious in interpreting these results as a robust constraint on the CMSSM, 
discussing possible ways out for the (in principle) disfavoured regions. 

• Finally we consider all constraints at the same time (including $a_\mu$ and $\Omega_{\rm DM}$). The 
bulk and co-annihilation regions are now clearly selected (for $\mu > 0$) amongst the 
various possibilities to obtain $\Omega_{\rm DM}$. Again, these results have to be taken with 
caution. 

• We perform a similar analysis for $\mu < 0$. The most important change in the results 
is that when $a_\mu$ (using $e^+e^-$ data) is taken into account, this scenario becomes 
disfavoured, as it cannot reproduce the experimental measurement. Besides, there is no push 
of the MSSM parameters into the low-energy region (actually, the opposite is true, 
but in a mild way). 

We also perform a Bayesian comparison of both scenarios, showing quantitatively 
how much better the $\mu > 0$ case performs than the $\mu < 0$ one. Actually, the advantage of 
the positive-$\mu$ case only occurs when $a_\mu$ constraints (using $e^+e^-$ data) are taken into 
account. 
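
The logic of such a Bayesian comparison can be sketched with a minimal toy, which is not the computation performed in the paper. It assumes a single hypothetical parameter theta >= 0 with a flat prior, a supersymmetric contribution to $a_\mu$ equal to sign($\mu$) times theta, and a Gaussian likelihood; the observed discrepancy `d_obs` is expressed in units of its uncertainty and the value used below is purely illustrative. The evidence of each scenario is the prior-weighted integral of the likelihood, and the Bayes factor is the ratio of the two evidences.

```python
import math


def gaussian(x, mean, sigma):
    """Normalized Gaussian density."""
    norm = sigma * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / norm


def evidence(sign, d_obs, sigma=1.0, theta_max=10.0, n=20000):
    """Evidence of one scenario: likelihood averaged over a flat prior on theta.

    ASSUMPTION: the predicted shift in a_mu is sign * theta, theta in [0, theta_max].
    """
    step = theta_max / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * step  # midpoint rule
        total += gaussian(d_obs, sign * theta, sigma) * step
    return total / theta_max  # flat prior density is 1 / theta_max


def log_bayes_factor(d_obs, sigma=1.0, theta_max=10.0):
    """ln of evidence(mu > 0) / evidence(mu < 0) in the toy model."""
    z_pos = evidence(+1, d_obs, sigma, theta_max)
    z_neg = evidence(-1, d_obs, sigma, theta_max)
    return math.log(z_pos / z_neg)


if __name__ == "__main__":
    # Illustrative inputs only: a positive discrepancy versus none at all.
    print("with a positive discrepancy:", log_bayes_factor(3.0))
    print("with no discrepancy:        ", log_bayes_factor(0.0))
```

In this toy the two scenarios are indistinguishable when the data show no discrepancy, and the positive sign is preferred only once a positive discrepancy is fed in, which mirrors the qualitative statement above.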

In summary, the LHC offers an exciting horizon for SUSY discovery, but there is still a 
non-negligible possibility that it escapes detection, especially if the Higgs mass is not close 
to its present experimental limit. So we should keep our fingers crossed. 